XYLANASES, NUCLEIC ACIDS ENCODING THEM AND
METHODS FOR MAKING AND USING THEM

CROSS-REFERENCE TO RELATED APPLICATIONS 
This application claims the benefit of priority under 35 U.S.C. § 119(e) of U.S.
Provisional Application No. 60/389,299, filed June 14, 2002. The aforementioned
application is explicitly incorporated herein by reference in its entirety and for all purposes.

FIELD OF THE INVENTION 
This invention relates generally to enzymes, polynucleotides encoding the
enzymes, the use of such polynucleotides and polypeptides, and more specifically to enzymes
having xylanase activity, e.g., catalyzing hydrolysis of internal β-1,4-xylosidic linkages or
endo-β-1,4-glucanase linkages.

BACKGROUND 

Xylanases (e.g., endo-1,4-beta-xylanase, EC 3.2.1.8) hydrolyze internal
β-1,4-xylosidic linkages in xylan to produce smaller molecular weight xylose and
xylo-oligomers. Xylans are polysaccharides formed from 1,4-β-glycoside-linked
D-xylopyranoses. Xylanases are of considerable commercial value, being used in the food
industry, for baking and fruit and vegetable processing, breakdown of agricultural waste, in
the manufacture of animal feed and in pulp and paper production. Xylanases are formed by
fungi and bacteria.

Arabinoxylans are major non-starch polysaccharides of cereals, representing
2.5 - 7.1% w/w depending on variety and growth conditions. The physicochemical
properties of this polysaccharide are such that it gives rise to viscous solutions or even gels
under oxidative conditions. In addition, arabinoxylans have high water-binding capacity and
may have a role in protein foam stability. All of these characteristics present problems for
several industries including brewing, baking, animal nutrition and paper manufacturing. In
brewing applications, the presence of xylan results in wort filterability and haze formation
issues. In baking applications (especially for cookies and crackers), these arabinoxylans
create sticky doughs that are difficult to machine and reduce biscuit size. In addition, this
carbohydrate is implicated in rapid rehydration of the baked product resulting in loss of
crispiness and reduced shelf-life. For monogastric animal feed applications with cereal diets,
arabinoxylan is a major contributing factor to viscosity of gut contents and thereby adversely
affects the digestibility of the feed and animal growth rate. For ruminant animals, these
polysaccharides represent substantial components of fiber intake and more complete
digestion of arabinoxylans would facilitate higher feed conversion efficiencies.

Xylanases are currently used as additives (dough conditioners) in dough
processing for the hydrolysis of water soluble arabinoxylan. In baking applications
(especially for cookies and crackers), arabinoxylan creates sticky doughs that are difficult to
machine and reduce biscuit size. In addition, this carbohydrate is implicated in rapid
rehydration of the baked product resulting in loss of crispiness and reduced shelf-life.

The enhancement of xylan digestion in animal feed may improve the
availability and digestibility of valuable carbohydrate and protein feed nutrients. For
monogastric animal feed applications with cereal diets, arabinoxylan is a major contributing
factor to viscosity of gut contents and thereby adversely affects the digestibility of the feed
and animal growth rate. For ruminant animals, these polysaccharides represent substantial
components of fiber intake and more complete digestion would facilitate higher feed
conversion efficiencies. It is desirable for animal feed xylanases to be active in the animal
stomach. This requires a feed enzyme to have high activity at 37°C and at low pH for
monogastrics (pH 2-4) and near neutral pH for ruminants (pH 6.5-7). The enzyme should also
possess resistance to animal gut xylanases and stability at the higher temperatures involved in
feed pelleting. As such, there is a need in the art for xylanase feed additives for monogastric
feed with high specific activity, activity at 35-40°C and pH 2-4, a half-life greater than 30
minutes in SGF and a half-life greater than 5 minutes at 85°C in formulated state. For ruminant
feed, there is a need for xylanase feed additives that have a high specific activity, activity at
35-40°C and pH 6.5-7.0, a half-life greater than 30 minutes in SRF and stability as a
concentrated dry powder.
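By way of illustration only, the half-life targets stated above can be checked against a residual-activity measurement. The following is a minimal sketch that assumes first-order (single-exponential) inactivation, which the text does not specify; the function names and the example numbers are hypothetical.

```python
import math


def half_life_minutes(exposure_minutes, residual_fraction):
    """Estimate half-life from one residual-activity measurement.

    Assumes first-order inactivation: A(t) = A0 * exp(-k * t).
    residual_fraction is A(t) / A0 and must lie strictly between 0 and 1.
    """
    if exposure_minutes <= 0:
        raise ValueError("exposure_minutes must be positive")
    if not 0.0 < residual_fraction < 1.0:
        raise ValueError("residual_fraction must be strictly between 0 and 1")
    rate_constant = -math.log(residual_fraction) / exposure_minutes
    return math.log(2) / rate_constant


def meets_monogastric_targets(half_life_sgf_min, half_life_85c_min):
    """Compare against the thresholds stated in the text:
    half-life > 30 minutes in SGF and > 5 minutes at 85 degrees C."""
    return half_life_sgf_min > 30.0 and half_life_85c_min > 5.0


if __name__ == "__main__":
    # Hypothetical measurement: 25% activity remaining after 10 minutes at 85 degrees C.
    t_half_85c = half_life_minutes(10.0, 0.25)
    print(f"half-life at 85 C: {t_half_85c:.1f} min")
    print("meets targets:", meets_monogastric_targets(45.0, t_half_85c))
```

A single time point is the weakest possible estimate; in practice one would likely fit several time points, and a non-exponential decay would call for a different model.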

Xylanases are also used in a number of other applications. For example,
xylanases are used in improving the quality and quantity of milk protein production in
lactating cows (see, for example, Kung, L., et al, J. Dairy Science, 2000 Jan 83:115-122),
increasing the amount of soluble saccharides in the stomach and small intestine of pigs (see,
for example, van der Meulen, J. et al, Arch. Tierernahr. 2001 54:101-115), and improving late
egg production efficiency and egg yields in hens (see, for example, Jaroni, D., et al, Poult.
Sci., 1999 June 78:841-847). Additionally, xylanases have been shown to be useful in
biobleaching and treatment of chemical pulps (see, for example, U.S. Pat. No. 5,202,249),
biobleaching and treatment of wood or paper pulps (see, for example, U.S. Pat. Nos.
5,179,021, 5,116,746, 5,407,827, 5,405,769, 5,395,765, 5,369,024, 5,457,045, 5,434,071,
5,498,534, 5,591,304, 5,645,686, 5,725,732, 5,759,840, 5,834,301, 5,871,730 and 6,057,438),
in reducing lignin in wood and modifying wood (see, for example, U.S. Pat. Nos. 5,486,468
and 5,770,012), as flour, dough and bread improvers (see, for example, U.S. Pat. Nos.
5,108,765 and 5,306,633), as feed additives and/or supplements, as set forth above (see, for
example, U.S. Pat. Nos. 5,432,074, 5,429,828, 5,612,055, 5,720,971, 5,981,233, 5,948,667,
6,099,844, 6,132,727 and 6,132,716), and in manufacturing cellulose solutions (see, for example,
U.S. Pat. No. 5,760,211). Detergent compositions having xylanase activity are used for fruit,
vegetables and/or mud and clay compounds (see, for example, U.S. Pat. No. 5,786,316).

Xylanases are also useful in compositions and methods of use of a
carbohydrase and/or a xylanase for the manufacture of an agent for the treatment and/or
prophylaxis of coccidiosis. The manufactured agent can be in the form of a cereal-based
animal feed (see, for example, U.S. Pat. No. 5,624,678). Additional uses for xylanases
include use in the production of water soluble dietary fiber (see, for example, U.S. Pat. No.
5,622,738), in improving the filterability, separation and production of starch (see, for
example, U.S. Pat. Nos. 4,960,705 and 5,023,176), in the beverage industry in improving
filterability of wort or beer (see, for example, U.S. Pat. No. 4,746,517), in an enzyme
composition for promoting the secretion of milk of livestock and improving the quality of the
milk (see, for example, U.S. Pat. No. 4,144,354), in reducing viscosity of plant material (see,
for example, U.S. Pat. No. 5,874,274), and in increasing viscosity or gel strength of food products
such as jam, marmalade, jelly, juice, paste, soup, salsa, etc. (see, for example, U.S. Pat. No.
6,036,981). Xylanases may also be used in hydrolysis of hemicellulose for which it is
selective, particularly in the presence of cellulose. Additionally, the cellulase rich retentate is
suitable for the hydrolysis of cellulose (see, for example, U.S. Pat. No. 4,725,544).

Various uses of xylanases include the production of ethanol (see, for
example, PCT Application Nos. WO0043496 and WO8100857), in transformation of a
microbe that produces ethanol (see, for example, PCT Application No. WO99/46362), in
production of oenological tannins and enzymatic composition (see, for example, PCT
Application No. WO0164830), in stimulating the natural defenses of plants (see, for example,
PCT Application No. WO0130161), in production of sugars from hemicellulose substrates
(see, for example, PCT Application No. WO9203541), in the cleaning of fruit, vegetables,
mud or clay containing soils (see, for example, PCT Application No. WO9613568), in
cleaning beer filtration membranes (see, for example, PCT Application No. WO9623579), in
a method of killing or inhibiting microbial cells (see, for example, PCT Application No.
WO9732480) and in determining the characteristics of process waters from wood pulp
bleaching by using the ratios of two UV absorption measurements and comparing the spectra
(see, for example, PCT Application No. WO9840721).

With regard to xylanases used in the paper and pulp industry, xylanases have
been isolated from many sources. In particular, see U.S. Patent Nos. 6,083,733,
6,140,095 and 6,346,407. It is noted that U.S. Patent No. 6,140,095 addresses
alkali-tolerant xylanases. However, there remains a need in the art for
xylanases to be used in the paper and pulp industry where the enzyme is active in the
temperature range of 65°C to 75°C and at a pH of approximately 10. Additionally, an
enzyme of the invention useful in the paper and pulp industry would decrease the need for
bleaching chemicals, such as chlorine dioxide.

The publications discussed herein are provided solely for their disclosure
prior to the filing date of the present application. Nothing herein is to be construed as an
admission that the invention is not entitled to antedate such disclosure by virtue of prior
invention.

SUMMARY OF THE INVENTION 
The invention provides isolated or recombinant nucleic acids comprising a
nucleic acid sequence having at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%,
58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%,
74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%,
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%)
sequence identity to an exemplary nucleic acid of the invention, e.g.,
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35,
SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71,
SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83,
SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95,
SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107,
SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119,
SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID NO:129, SEQ ID NO:131,
SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143,
SEQ ID NO:145, SEQ ID NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155,
SEQ ID NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID NO:167,
SEQ ID NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID NO:179,
SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID NO:189, SEQ ID NO:191,
SEQ ID NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID NO:203,
SEQ ID NO:205, SEQ ID NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215,
SEQ ID NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID NO:227,
SEQ ID NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID NO:239,
SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID NO:249, SEQ ID NO:251,
SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID NO:263,
SEQ ID NO:265, SEQ ID NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275,
SEQ ID NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID NO:287,
SEQ ID NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID NO:299,
SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID NO:309, SEQ ID NO:311,
SEQ ID NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID NO:323,
SEQ ID NO:325, SEQ ID NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335,
SEQ ID NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID NO:347,
SEQ ID NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID NO:359,
SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID NO:369, SEQ ID NO:371,
SEQ ID NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID NO:379, over a region of at least
about 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550,
600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350,
1400, 1450, 1500, 1550, 1600, 1650, 1700, 1750, 1800, 1850, 1900, 1950, 2000, 2050, 2100,
2200, 2250, 2300, 2350, 2400, 2450, 2500, or more residues, wherein the nucleic acid encodes
at least one polypeptide having a xylanase activity, and the sequence identities are determined
by analysis with a sequence comparison algorithm or by a visual inspection.
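As a purely illustrative sketch of what "sequence identity over a region" can mean computationally, the following counts identical columns in two sequences that have already been aligned (gaps written as "-"). The convention used here (identical columns divided by all alignment columns in the window) is an assumption; sequence comparison algorithms differ in how they treat gaps, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned strings of equal length; '-' marks a gap.
    Gap columns count toward the denominator but never as matches
    (one of several conventions in use).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must be non-empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def has_identity_over_region(aligned_a, aligned_b, min_identity, region):
    """True if some window of `region` alignment columns reaches min_identity."""
    n = len(aligned_a)
    if region <= 0 or region > n:
        return False
    return any(
        percent_identity(aligned_a[i:i + region], aligned_b[i:i + region]) >= min_identity
        for i in range(n - region + 1)
    )


if __name__ == "__main__":
    a = "ACGTACGTAC"
    b = "ACGTTCGTAC"
    print(percent_identity(a, b))
    print(has_identity_over_region(a, b, 100.0, 5))
```

The sliding window makes the "over a region of at least about N residues" language concrete: a pair of sequences can fail a threshold across the full alignment and still satisfy it over a shorter region.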
Exemplary nucleic acids of the invention also include isolated or recombinant
nucleic acids encoding a polypeptide having a sequence as set forth in
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12,
SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24,
SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36,
SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48,
SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72,
SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:78, SEQ ID NO:80, SEQ ID NO:82, SEQ ID NO:84,
SEQ ID NO:86, SEQ ID NO:88, SEQ ID NO:90, SEQ ID NO:92, SEQ ID NO:94, SEQ ID NO:96,
SEQ ID NO:98, SEQ ID NO:100, SEQ ID NO:102, SEQ ID NO:104, SEQ ID NO:106, SEQ ID NO:108,
SEQ ID NO:110, SEQ ID NO:112, SEQ ID NO:114, SEQ ID NO:116, SEQ ID NO:118, SEQ ID NO:120,
SEQ ID NO:122, SEQ ID NO:124, SEQ ID NO:126, SEQ ID NO:128, SEQ ID NO:130, SEQ ID NO:132,
SEQ ID NO:134, SEQ ID NO:136, SEQ ID NO:138, SEQ ID NO:140, SEQ ID NO:142, SEQ ID NO:144,
SEQ ID NO:146, SEQ ID NO:148, SEQ ID NO:150, SEQ ID NO:152, SEQ ID NO:154, SEQ ID NO:156,
SEQ ID NO:158, SEQ ID NO:160, SEQ ID NO:162, SEQ ID NO:164, SEQ ID NO:166, SEQ ID NO:168,
SEQ ID NO:170, SEQ ID NO:172, SEQ ID NO:174, SEQ ID NO:176, SEQ ID NO:178, SEQ ID NO:180,
SEQ ID NO:182, SEQ ID NO:184, SEQ ID NO:186, SEQ ID NO:188, SEQ ID NO:190, SEQ ID NO:192,
SEQ ID NO:194, SEQ ID NO:196, SEQ ID NO:198, SEQ ID NO:200, SEQ ID NO:202, SEQ ID NO:204,
SEQ ID NO:206, SEQ ID NO:208, SEQ ID NO:210, SEQ ID NO:212, SEQ ID NO:214, SEQ ID NO:216,
SEQ ID NO:218, SEQ ID NO:220, SEQ ID NO:222, SEQ ID NO:224, SEQ ID NO:226, SEQ ID NO:228,
SEQ ID NO:230, SEQ ID NO:232, SEQ ID NO:234, SEQ ID NO:236, SEQ ID NO:238, SEQ ID NO:240,
SEQ ID NO:242, SEQ ID NO:244, SEQ ID NO:246, SEQ ID NO:248, SEQ ID NO:250, SEQ ID NO:252,
SEQ ID NO:254, SEQ ID NO:256, SEQ ID NO:258, SEQ ID NO:260, SEQ ID NO:262, SEQ ID NO:264,
SEQ ID NO:266, SEQ ID NO:268, SEQ ID NO:270, SEQ ID NO:272, SEQ ID NO:274, SEQ ID NO:276,
SEQ ID NO:278, SEQ ID NO:280, SEQ ID NO:282, SEQ ID NO:284, SEQ ID NO:286, SEQ ID NO:288,
SEQ ID NO:290, SEQ ID NO:292, SEQ ID NO:294, SEQ ID NO:296, SEQ ID NO:298, SEQ ID NO:300,
SEQ ID NO:302, SEQ ID NO:304, SEQ ID NO:306, SEQ ID NO:308, SEQ ID NO:310, SEQ ID NO:312,
SEQ ID NO:314, SEQ ID NO:316, SEQ ID NO:318, SEQ ID NO:320, SEQ ID NO:322, SEQ ID NO:324,
SEQ ID NO:326, SEQ ID NO:328, SEQ ID NO:330, SEQ ID NO:332, SEQ ID NO:334, SEQ ID NO:336,
SEQ ID NO:338, SEQ ID NO:340, SEQ ID NO:342, SEQ ID NO:344, SEQ ID NO:346, SEQ ID NO:348,
SEQ ID NO:350, SEQ ID NO:352, SEQ ID NO:354, SEQ ID NO:356, SEQ ID NO:358, SEQ ID NO:360,
SEQ ID NO:362, SEQ ID NO:364, SEQ ID NO:366, SEQ ID NO:368, SEQ ID NO:370, SEQ ID NO:372,
SEQ ID NO:374, SEQ ID NO:376, SEQ ID NO:378 or SEQ ID NO:380, and subsequences thereof
and variants thereof. In one aspect, the polypeptide has a xylanase activity.

In one aspect, the invention also provides xylanase-encoding nucleic acids
with a common novelty in that they are derived from mixed cultures. The invention provides
xylanase-encoding nucleic acids isolated from mixed cultures comprising a nucleic acid
sequence having at least about 10, 15, 20, 25, 30, 35, 40, 45, 50%, 51%, 52%, 53%, 54%,
55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%,
71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%,
87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or
complete (100%) sequence identity to an exemplary nucleic acid of the invention, e.g.,
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35,
SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71,
SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83,
SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95,
SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107,
SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119,
SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID NO:129, SEQ ID NO:131,
SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143,
SEQ ID NO:145, SEQ ID NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155,
SEQ ID NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID NO:167,
SEQ ID NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID NO:179,
SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID NO:189, SEQ ID NO:191,
SEQ ID NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID NO:203,
SEQ ID NO:205, SEQ ID NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215,
SEQ ID NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID NO:227,
SEQ ID NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID NO:239,
SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID NO:249, SEQ ID NO:251,
SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID NO:263,
SEQ ID NO:265, SEQ ID NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275,
SEQ ID NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID NO:287,
SEQ ID NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID NO:299,
SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID NO:309, SEQ ID NO:311,
SEQ ID NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID NO:323,
SEQ ID NO:325, SEQ ID NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335,
SEQ ID NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID NO:347,
SEQ ID NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID NO:359,
SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID NO:369, SEQ ID NO:371,
SEQ ID NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID NO:379, over a
region of at least about 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650,
700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, or more.

In one aspect, the invention also provides xylanase-encoding nucleic acids
with a common novelty in that they are derived from an environmental source, e.g., mixed
environmental sources, a bacterial source and/or an archaeal source, see Table 3, below. In
one aspect, the invention provides xylanase-encoding nucleic acids isolated from an
environmental source, e.g., a mixed environmental source, a bacterial source and/or an
archaeal source, comprising a nucleic acid sequence having at least about 10, 15, 20, 25, 30,
35, 40, 45, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%,
64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%,
80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98%, 99%, or more, or complete (100%) sequence identity to an exemplary
nucleic acid of the invention over a region of at least about 50, 75, 100, 150, 200, 250, 300,
350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150,
1200 or more residues, wherein the nucleic acid encodes at least one polypeptide having a
xylanase activity, and the sequence identities are determined by analysis with a sequence
comparison algorithm or by a visual inspection.

In one aspect, the invention also provides xylanase-encoding nucleic acids
with a common novelty in that they are derived from a common glycosidase family, e.g.,
family 5, 6, 8, 10, 11, 26 or 30, as set forth in Table 5, below.

In one aspect, the sequence comparison algorithm is a BLAST version 2.2.2
algorithm where a filtering setting is set to blastall -p blastp -d "nr pataa" -F F, and all other
options are set to default.
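For illustration, the command line quoted above can be assembled as an argument list so that the two-name database string "nr pataa" is passed as a single argument. This is a sketch only: the `-i` (query) and `-o` (output) flags and the file names are not taken from the text and are assumptions based on ordinary usage of the legacy `blastall` program; nothing is executed here.

```python
import shlex


def blastall_command(query_fasta, output_path):
    """Build the blastall argument list with the settings quoted in the text.

    -p blastp      protein-protein comparison
    -d "nr pataa"  database names, passed as one argument
    -F F           low-complexity filtering switched off
    The -i/-o arguments are assumptions added for a complete invocation.
    """
    return [
        "blastall",
        "-p", "blastp",
        "-d", "nr pataa",
        "-F", "F",
        "-i", query_fasta,
        "-o", output_path,
    ]


if __name__ == "__main__":
    cmd = blastall_command("query.faa", "hits.txt")  # hypothetical file names
    # shlex.join quotes the database argument so the printed line is copy-pasteable.
    print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids shell-quoting mistakes around the space in the database argument if the list is later handed to `subprocess.run`.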

Another aspect of the invention is an isolated or recombinant nucleic acid
including at least 10 consecutive bases of a nucleic acid sequence of the invention, sequences
substantially identical thereto, and the sequences complementary thereto.

In one aspect, the xylanase activity comprises catalyzing hydrolysis of internal
β-1,4-xylosidic linkages. In one aspect, the xylanase activity comprises an
endo-1,4-beta-xylanase activity.

In one aspect, the xylanase activity comprises hydrolyzing a xylan to produce
a smaller molecular weight xylose and xylo-oligomer. In one aspect, the xylan comprises an
arabinoxylan, such as a water soluble arabinoxylan. The water soluble arabinoxylan can
comprise a dough or a bread product.

In one aspect, the xylanase activity comprises hydrolyzing polysaccharides
comprising 1,4-β-glycoside-linked D-xylopyranoses. In one aspect, the xylanase activity
comprises hydrolyzing hemicelluloses. In one aspect, the xylanase activity comprises
hydrolyzing hemicelluloses in a wood or paper pulp or a paper product. In one aspect, the
invention provides methods for reducing lignin in a wood or wood product comprising
contacting the wood or wood product with a polypeptide of the invention.

In one aspect, the xylanase activity comprises catalyzing hydrolysis of xylans
in a beverage or a feed or a food product. The feed or food product can comprise a
cereal-based animal feed, a wort or a beer, a milk or a milk product, a fruit or a vegetable. In one
aspect, the invention provides a food, feed or beverage or a beverage precursor comprising a
polypeptide of the invention. The food can be a dough or a bread product. The beverage or a
beverage precursor can be a beer or a wort.

In one aspect, the invention provides methods of dough conditioning
comprising contacting a dough or a bread product with at least one polypeptide of the
invention under conditions sufficient for conditioning the dough. In one aspect, the invention
provides methods of beverage production comprising administration of at least one
polypeptide of the invention to a beverage or a beverage precursor under conditions sufficient
for decreasing the viscosity of the beverage.

In one aspect, the xylanase activity comprises catalyzing hydrolysis of xylans
in a cell, e.g., a plant cell or a microbial cell.

In one aspect, the isolated or recombinant nucleic acid encodes a polypeptide
having a xylanase activity that is thermostable. The polypeptide can retain a xylanase
activity under conditions comprising a temperature range of between about 37°C to about
95°C, between about 55°C to about 85°C, between about 70°C to about 95°C, or between
about 90°C to about 95°C.

In another aspect, the isolated or recombinant nucleic acid encodes a
polypeptide having a xylanase activity that is thermotolerant. The polypeptide can retain a
xylanase activity after exposure to a temperature in the range from greater than 37°C to about
95°C or anywhere in the range from greater than 55°C to about 85°C. The polypeptide can
retain a xylanase activity after exposure to a temperature in the range between about 1°C to
about 5°C, between about 5°C to about 15°C, between about 15°C to about 25°C, between
about 25°C to about 37°C, between about 37°C to about 95°C, between about 55°C to about
85°C, between about 70°C to about 75°C, or between about 90°C to about 95°C, or more. In
one aspect, the polypeptide retains a xylanase activity after exposure to a temperature in the
range from greater than 90°C to about 95°C at pH 4.5.

The invention provides isolated or recombinant nucleic acids comprising a
sequence that hybridizes under stringent conditions to a nucleic acid comprising a sequence
of the invention, e.g., a sequence as set forth in
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35,
SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71,
SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83,
SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95,
SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107,
SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119,
SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID NO:129, SEQ ID NO:131,
SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143,
SEQ ID NO:145, SEQ ID NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155,
SEQ ID NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID NO:167,
SEQ ID NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID NO:179,
SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID NO:189, SEQ ID NO:191,
SEQ ID NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID NO:203,
SEQ ID NO:205, SEQ ID NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215,
SEQ ID NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID NO:227,
SEQ ID NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID NO:239,
SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID NO:249, SEQ ID NO:251,
SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID NO:263,
SEQ ID NO:265, SEQ ID NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275,
SEQ ID NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID NO:287,
SEQ ID NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID NO:299,
SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID NO:309, SEQ ID NO:311,
SEQ ID NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID NO:323,
SEQ ID NO:325, SEQ ID NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335,
SEQ ID NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID NO:347,
SEQ ID NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID NO:359,
SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID NO:369, SEQ ID NO:371,
SEQ ID NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID NO:379, or fragments or subsequences
thereof. In one aspect, the nucleic acid encodes a polypeptide having a xylanase activity. The
nucleic acid can be at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 150, 200, 250,
300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150,
1200 or more residues in length or the full length of the gene or transcript. In one aspect, the
stringent conditions include a wash step comprising a wash in 0.2X SSC at a temperature of
about 65°C for about 15 minutes.
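To give a rough sense of how stringent such a wash is, the melting temperature of a DNA duplex in 0.2X SSC can be estimated and compared with the 65°C wash temperature. The sketch below is an assumption-laden illustration, not a method taken from the text: it uses a commonly quoted empirical approximation for long DNA duplexes, Tm = 81.5 + 16.6·log10[Na+] + 0.41·(%GC) − 675/N, and takes 1X SSC as 0.15 M NaCl plus 0.015 M trisodium citrate (about 0.195 M Na+). Published variants of the formula use slightly different length terms, and the probe parameters in the example are hypothetical.

```python
import math

# Sodium contributed by 1X SSC: 0.15 M NaCl + 3 * 0.015 M from trisodium citrate.
SSC_1X_SODIUM_MOLAR = 0.195


def sodium_molar(ssc_fold):
    """Approximate Na+ concentration (mol/L) of an SSC dilution such as 0.2X."""
    if ssc_fold <= 0:
        raise ValueError("ssc_fold must be positive")
    return ssc_fold * SSC_1X_SODIUM_MOLAR


def estimate_tm_celsius(gc_percent, length_bp, ssc_fold):
    """Empirical Tm estimate for a long, perfectly matched DNA duplex.

    gc_percent is 0-100; length_bp is the duplex length in base pairs.
    """
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    na = sodium_molar(ssc_fold)
    return 81.5 + 16.6 * math.log10(na) + 0.41 * gc_percent - 675.0 / length_bp


if __name__ == "__main__":
    # Hypothetical 500 bp duplex with 50% GC content, washed in 0.2X SSC.
    tm = estimate_tm_celsius(gc_percent=50.0, length_bp=500, ssc_fold=0.2)
    print(f"estimated Tm: {tm:.1f} C; margin below Tm at a 65 C wash: {tm - 65.0:.1f} C")
```

The smaller the margin between the wash temperature and the estimated Tm, the less mismatch a hybrid can tolerate and still survive the wash.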

The invention provides a nucleic acid probe for identifying a nucleic acid
encoding a polypeptide having a xylanase activity, wherein the probe comprises at least about
10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300,
350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 or more, consecutive
bases of a sequence comprising a sequence of the invention, or fragments or subsequences
thereof, wherein the probe identifies the nucleic acid by binding or hybridization. The probe
can comprise an oligonucleotide comprising at least about 10 to 50, about 20 to 60, about 30
to 70, about 40 to 80, or about 60 to 100 consecutive bases of a sequence comprising a
sequence of the invention, or fragments or subsequences thereof.

The invention provides a nucleic acid probe for identifying a nucleic acid
encoding a polypeptide having a xylanase activity, wherein the probe comprises a nucleic
acid comprising a sequence at least about 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200,
250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 or more
residues having at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%,
61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%,
77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence identity to a
nucleic acid of the invention, wherein the sequence identities are determined by analysis with
a sequence comparison algorithm or by visual inspection.

The probe can comprise an oligonucleotide comprising at least about 10 to 50,
about 20 to 60, about 30 to 70, about 40 to 80, or about 60 to 100 consecutive bases of a
nucleic acid sequence of the invention, or a subsequence thereof.

The invention provides an amplification primer pair for amplifying a nucleic
acid encoding a polypeptide having a xylanase activity, wherein the primer pair is capable of
amplifying a nucleic acid comprising a sequence of the invention, or fragments or
subsequences thereof. One or each member of the amplification primer sequence pair can
comprise an oligonucleotide comprising at least about 10 to 50 consecutive bases of the
sequence, or about 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or
more consecutive bases of the sequence.

The invention provides amplification primer pairs, wherein the primer pair
comprises a first member having a sequence as set forth by about the first (the 5') 12, 13, 14, 
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more residues of a nucleic acid 
of the invention, and a second member having a sequence as set forth by about the first (the 
5') 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more residues of
the complementary strand of the first member.
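
By way of illustration only, a primer pair of this kind can be derived from a template sequence as sketched below. The sketch assumes a DNA template written 5' to 3' using only A, C, G and T, and it reads "the complementary strand" as the reverse complement of the template, so that both primers are reported 5' to 3'. The names `reverse_complement` and `primer_pair`, and the enforced minimum length of 12, are assumptions made for this example.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_pair(template: str, length: int = 20) -> tuple[str, str]:
    """Take the first (5') `length` residues of each strand of the template."""
    if length < 12:
        raise ValueError("this sketch uses primers of at least 12 residues")
    if length > len(template):
        raise ValueError("primer length exceeds template length")
    forward = template[:length]                      # 5' end of the given strand
    reverse = reverse_complement(template)[:length]  # 5' end of the complementary strand
    return forward, reverse
```

For example, `primer_pair("ATGAAACCCGGGTTTTAG", 12)` yields one primer from the 5' end of each strand of the 18-residue template.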

The invention provides xylanase-encoding nucleic acids generated by 
amplification, e.g., polymerase chain reaction (PCR), using an amplification primer pair of 
the invention. The invention provides xylanases generated by amplification, e.g., polymerase 
chain reaction (PCR), using an amplification primer pair of the invention. The invention
provides methods of making a xylanase by amplification, e.g., polymerase chain reaction
(PCR), using an amplification primer pair of the invention. In one aspect, the amplification 
primer pair amplifies a nucleic acid from a library, e.g., a gene library, such as an 
environmental library. 

The invention provides methods of amplifying a nucleic acid encoding a
polypeptide having a xylanase activity comprising amplification of a template nucleic acid 
with an amplification primer sequence pair capable of amplifying a nucleic acid sequence of 
the invention, or fragments or subsequences thereof. 

The invention provides expression cassettes comprising a nucleic acid of the
invention or a subsequence thereof. In one aspect, the expression cassette can comprise the
nucleic acid that is operably linked to a promoter. The promoter can be a viral, bacterial, 
mammalian or plant promoter. In one aspect, the plant promoter can be a potato, rice, corn, 
wheat, tobacco or barley promoter. The promoter can be a constitutive promoter. The 
constitutive promoter can comprise CaMV35S. In another aspect, the promoter can be an
inducible promoter. In one aspect, the promoter can be a tissue-specific promoter or an
environmentally regulated or a developmentally regulated promoter. Thus, the promoter can
be, e.g., a seed-specific, a leaf-specific, a root-specific, a stem-specific or an abscission- 
induced promoter. In one aspect, the expression cassette can further comprise a plant or plant 
virus expression vector. 

The invention provides cloning vehicles comprising an expression cassette
(e.g., a vector) of the invention or a nucleic acid of the invention. The cloning vehicle can be 
a viral vector, a plasmid, a phage, a phagemid, a cosmid, a fosmid, a bacteriophage or an 
artificial chromosome. The viral vector can comprise an adenovirus vector, a retroviral 
vector or an adeno-associated viral vector. The cloning vehicle can comprise a bacterial
artificial chromosome (BAC), a plasmid, a bacteriophage P1-derived vector (PAC), a yeast
artificial chromosome (YAC), or a mammalian artificial chromosome (MAC). 

The invention provides transformed cells comprising a nucleic acid of the
invention or an expression cassette (e.g., a vector) of the invention, or a cloning vehicle of the
invention. In one aspect, the transformed cell can be a bacterial cell, a mammalian cell, a
fungal cell, a yeast cell, an insect cell or a plant cell. In one aspect, the plant cell can be a
cereal, a potato, wheat, rice, corn, tobacco or barley cell. 

The invention provides transgenic non-human animals comprising a nucleic 
acid of the invention or an expression cassette (e.g., a vector) of the invention. In one aspect, 
the animal is a mouse.

The invention provides transgenic plants comprising a nucleic acid of the
invention or an expression cassette (e.g., a vector) of the invention. The transgenic plant can 
be a cereal plant, a corn plant, a potato plant, a tomato plant, a wheat plant, an oilseed plant, a 
rapeseed plant, a soybean plant, a rice plant, a barley plant or a tobacco plant.

The invention provides transgenic seeds comprising a nucleic acid of the
invention or an expression cassette (e.g., a vector) of the invention. The transgenic seed can
be a cereal plant, a corn seed, a wheat kernel, an oilseed, a rapeseed, a soybean seed, a palm
kernel, a sunflower seed, a sesame seed, a peanut or a tobacco plant seed. 

The invention provides an antisense oligonucleotide comprising a nucleic acid
sequence complementary to or capable of hybridizing under stringent conditions to a nucleic
acid of the invention. The invention provides methods of inhibiting the translation of a 
xylanase message in a cell comprising administering to the cell or expressing in the cell an 
antisense oligonucleotide comprising a nucleic acid sequence complementary to or capable of 
hybridizing under stringent conditions to a nucleic acid of the invention. In one aspect, the
antisense oligonucleotide is between about 10 to 50, about 20 to 60, about 30 to 70, about 40
to 80, or about 60 to 100 bases in length. 

The invention provides methods of inhibiting the translation of a xylanase 
message in a cell comprising administering to the cell or expressing in the cell an antisense 
oligonucleotide comprising a nucleic acid sequence complementary to or capable of
hybridizing under stringent conditions to a nucleic acid of the invention. The invention
provides double-stranded inhibitory RNA (RNAi) molecules comprising a subsequence of a
sequence of the invention. In one aspect, the RNAi is about 15, 16, 17, 18, 19, 20, 21, 22, 23, 
24, 25 or more duplex nucleotides in length. The invention provides methods of inhibiting 
the expression of a xylanase in a cell comprising administering to the cell or expressing in the
cell a double-stranded inhibitory RNA (iRNA), wherein the RNA comprises a subsequence
of a sequence of the invention. 

The invention provides an isolated or recombinant polypeptide comprising an 
amino acid sequence having at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 
58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%,
74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%,
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%) 
sequence identity to an exemplary polypeptide or peptide of the invention over a region of at 
least about 25, 50, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350 or more 
residues, or over the full length of the polypeptide, and the sequence identities are determined 




WO 03/106654 PCT/US03/19153 
by analysis with a sequence comparison algorithm or by a visual inspection. Exemplary 
polypeptide or peptide sequences of the invention include SEQ ID NO:2, SEQ ID NO:4, SEQ 
ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, 
SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID
NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38,
SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID
NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID
NO:72, SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:78, SEQ ID NO:80, SEQ ID NO:82,
SEQ ID NO:84, SEQ ID NO:86, SEQ ID NO:88, SEQ ID NO:90, SEQ ID NO:92, SEQ ID
NO:94, SEQ ID NO:96, SEQ ID NO:98, SEQ ID NO:100, SEQ ID NO:102, SEQ ID
NO:104, SEQ ID NO:106, SEQ ID NO:108, SEQ ID NO:110, SEQ ID NO:112, SEQ ID
NO:114, SEQ ID NO:116, SEQ ID NO:118, SEQ ID NO:120, SEQ ID NO:122, SEQ ID
NO:124, SEQ ID NO:126, SEQ ID NO:128, SEQ ID NO:130, SEQ ID NO:132; SEQ ID
NO:134; SEQ ID NO:136; SEQ ID NO:138; SEQ ID NO:140; SEQ ID NO:142; SEQ ID
NO:144; NO:146, SEQ ID NO:148, SEQ ID NO:150, SEQ ID NO:152, SEQ ID NO:154,
SEQ ID NO:156, SEQ ID NO:158, SEQ ID NO:160, SEQ ID NO:162, SEQ ID NO:164,
SEQ ID NO:166, SEQ ID NO:168, SEQ ID NO:170, SEQ ID NO:172, SEQ ID NO:174,
SEQ ID NO:176, SEQ ID NO:178, SEQ ID NO:180, SEQ ID NO:182, SEQ ID NO:184,
SEQ ID NO:186, SEQ ID NO:188, SEQ ID NO:190, SEQ ID NO:192, SEQ ID NO:194,
SEQ ID NO:196, SEQ ID NO:198, SEQ ID NO:200, SEQ ID NO:202, SEQ ID NO:204,
SEQ ID NO:206, SEQ ID NO:208, SEQ ID NO:210, SEQ ID NO:212, SEQ ID NO:214,
SEQ ID NO:216, SEQ ID NO:218, SEQ ID NO:220, SEQ ID NO:222, SEQ ID NO:224,
SEQ ID NO:226, SEQ ID NO:228, SEQ ID NO:230, SEQ ID NO:232, SEQ ID NO:234,
SEQ ID NO:236, SEQ ID NO:238, SEQ ID NO:240, SEQ ID NO:242, SEQ ID NO:244,
SEQ ID NO:246, SEQ ID NO:248, SEQ ID NO:250, SEQ ID NO:252, SEQ ID NO:254,
SEQ ID NO:256, SEQ ID NO:258, SEQ ID NO:260, SEQ ID NO:262, SEQ ID NO:264,
SEQ ID NO:266, SEQ ID NO:268, SEQ ID NO:270, SEQ ID NO:272, SEQ ID NO:274,
SEQ ID NO:276, SEQ ID NO:278, SEQ ID NO:280, SEQ ID NO:282, SEQ ID NO:284,
SEQ ID NO:286, SEQ ID NO:288, SEQ ID NO:290, SEQ ID NO:292, SEQ ID NO:294,
SEQ ID NO:296, SEQ ID NO:298, SEQ ID NO:300, SEQ ID NO:302, SEQ ID NO:304,
SEQ ID NO:306, SEQ ID NO:308, SEQ ID NO:310, SEQ ID NO:312, SEQ ID NO:314,
SEQ ID NO:316, SEQ ID NO:318, SEQ ID NO:320, SEQ ID NO:322, SEQ ID NO:324,
SEQ ID NO:326, SEQ ID NO:328, SEQ ID NO:330, SEQ ID NO:332, SEQ ID NO:334,
SEQ ID NO:336, SEQ ID NO:338, SEQ ID NO:340, SEQ ID NO:342, SEQ ID NO:344,
SEQ ID NO:346, SEQ ID NO:348, SEQ ID NO:350, SEQ ID NO:352, SEQ ID NO:354,
SEQ ID NO:356, SEQ ID NO:358, SEQ ID NO:360, SEQ ID NO:362, SEQ ID NO:364,
SEQ ID NO:366, SEQ ID NO:368, SEQ ID NO:370, SEQ ID NO:372, SEQ ID NO:374,
SEQ ID NO:376, SEQ ID NO:378 or SEQ ID NO:380, and subsequences thereof and
variants thereof. Exemplary polypeptides also include fragments of at least about 10, 15, 20, 
25, 30, 35, 40, 45, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600 or more 
residues in length, or over the full length of an enzyme. Exemplary polypeptide or peptide 
sequences of the invention include sequences encoded by a nucleic acid of the invention.
Exemplary polypeptide or peptide sequences of the invention include polypeptides or
peptides specifically bound by an antibody of the invention. In one aspect, a polypeptide of 
the invention has at least one xylanase activity. 

Another aspect of the invention provides an isolated or recombinant 
polypeptide or peptide including at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75,
80, 85, 90, 95 or 100 or more consecutive residues of a polypeptide or peptide sequence of the
invention, sequences substantially identical thereto, and the sequences complementary 
thereto. The peptide can be, e.g., an immunogenic fragment, a motif (e.g., a binding site), a 
signal sequence, a prepro sequence or an active site. 

The invention provides isolated or recombinant nucleic acids comprising a
sequence encoding a polypeptide having a xylanase activity and a signal sequence, wherein
the nucleic acid comprises a sequence of the invention. The signal sequence can be derived 
from another xylanase or a non-xylanase (a heterologous) enzyme. The invention provides 
isolated or recombinant nucleic acids comprising a sequence encoding a polypeptide having a 
xylanase activity, wherein the sequence does not contain a signal sequence and the nucleic
acid comprises a sequence of the invention.

In one aspect, the xylanase activity comprises catalyzing hydrolysis of internal 
β-1,4-xylosidic linkages. In one aspect, the xylanase activity comprises an endo-1,4-beta-xylanase
activity. In one aspect, the xylanase activity comprises hydrolyzing a xylan to
produce a smaller molecular weight xylose and xylo-oligomer. In one aspect, the xylan
comprises an arabinoxylan, such as a water soluble arabinoxylan. The water soluble
arabinoxylan can comprise a dough or a bread product. 

In one aspect, the xylanase activity comprises hydrolyzing polysaccharides 
comprising 1,4-β-glycoside-linked D-xylopyranoses. In one aspect, the xylanase activity
comprises hydrolyzing hemicelluloses. In one aspect, the xylanase activity comprises
hydrolyzing hemicelluloses in a wood or paper pulp or a paper product. 

In one aspect, the xylanase activity comprises catalyzing hydrolysis of xylans
in a feed or a food product. The feed or food product can comprise a cereal-based animal
feed, a wort or a beer, a milk or a milk product, a fruit or a vegetable. 

In one aspect, the xylanase activity comprises catalyzing hydrolysis of xylans 
in a cell, e.g., a plant cell or a microbial cell.

In one aspect, the xylanase activity is thermostable. The polypeptide can 
retain a xylanase activity under conditions comprising a temperature range of between about 
1°C to about 5°C, between about 5°C to about 15°C, between about 15°C to about 25°C, 
between about 25°C to about 37°C, between about 37°C to about 95°C, between about 55°C to 
about 85°C, between about 70°C to about 75°C, or between about 90°C to about 95°C, or 
more. In another aspect, the xylanase activity can be thermotolerant. The polypeptide can 
retain a xylanase activity after exposure to a temperature in the range from greater than 37°C 
to about 95°C, or in the range from greater than 55°C to about 85°C. In one aspect, the 
polypeptide can retain a xylanase activity after exposure to a temperature in the range from 
greater than 90°C to about 95°C at pH 4.5. 

In one aspect, the isolated or recombinant polypeptide can comprise the 
polypeptide of the invention that lacks a signal sequence. In one aspect, the isolated or 
recombinant polypeptide can comprise the polypeptide of the invention comprising a 
heterologous signal sequence, such as a heterologous xylanase or non-xylanase signal 
sequence. 

In one aspect, the invention provides chimeric proteins comprising a first 
domain comprising a signal sequence of the invention and at least a second domain. The 
protein can be a fusion protein. The second domain can comprise an enzyme. The enzyme 
can be a xylanase. 

The invention provides chimeric polypeptides comprising at least a first 
domain comprising a signal peptide (SP), a prepro sequence and/or a catalytic domain (CD) of
the invention and at least a second domain comprising a heterologous polypeptide or peptide,
wherein the heterologous polypeptide or peptide is not naturally associated with the signal
peptide (SP), prepro sequence and/or catalytic domain (CD). In one aspect, the heterologous
polypeptide or peptide is not a xylanase. The heterologous polypeptide or peptide can be 
amino terminal to, carboxy terminal to or on both ends of the signal peptide (SP), prepro 
sequence and/or catalytic domain (CD).

The invention provides isolated or recombinant nucleic acids encoding a
chimeric polypeptide, wherein the chimeric polypeptide comprises at least a first domain
comprising a signal peptide (SP), a prepro domain and/or a catalytic domain (CD) of the
invention and at least a second domain comprising a heterologous polypeptide or peptide,
wherein the heterologous polypeptide or peptide is not naturally associated with the signal
peptide (SP), prepro domain and/or catalytic domain (CD).

The invention provides isolated or recombinant signal sequences (e.g., signal 
peptides) consisting of a sequence as set forth in residues 1 to 14, 1 to 15, 1 to 16, 1 to 17, 1 
to 18, 1 to 19, 1 to 20, 1 to 21, 1 to 22, 1 to 23, 1 to 24, 1 to 25, 1 to 26, 1 to 27, 1 to 28, 1 to
28, 1 to 30, 1 to 31, 1 to 32, 1 to 33, 1 to 34, 1 to 35, 1 to 36, 1 to 37, 1 to 38, 1 to 40, 1 to 41,
1 to 42, 1 to 43 or 1 to 44, of a polypeptide of the invention, e.g., SEQ ID NO:2, SEQ ID
NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ
ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26,
SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID
NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48,
SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID
NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:72, SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:78, SEQ ID NO:80, SEQ ID
NO:82, SEQ ID NO:84, SEQ ID NO:86, SEQ ID NO:88, SEQ ID NO:90, SEQ ID NO:92,
SEQ ID NO:94, SEQ ID NO:96, SEQ ID NO:98, SEQ ID NO:100, SEQ ID NO:102, SEQ ID
NO:104, SEQ ID NO:106, SEQ ID NO:108, SEQ ID NO:110, SEQ ID NO:112, SEQ ID
NO:114, SEQ ID NO:116, SEQ ID NO:118, SEQ ID NO:120, SEQ ID NO:122, SEQ ID
NO:124, SEQ ID NO:126, SEQ ID NO:128, SEQ ID NO:130, SEQ ID NO:132; SEQ ID
NO:134; SEQ ID NO:136; SEQ ID NO:138; SEQ ID NO:140; SEQ ID NO:142; SEQ ID
NO:144; NO:146, SEQ ID NO:148, SEQ ID NO:150, SEQ ID NO:152, SEQ ID NO:154,
SEQ ID NO:156, SEQ ID NO:158, SEQ ID NO:160, SEQ ID NO:162, SEQ ID NO:164,
SEQ ID NO:166, SEQ ID NO:168, SEQ ID NO:170, SEQ ID NO:172, SEQ ID NO:174,
SEQ ID NO:176, SEQ ID NO:178, SEQ ID NO:180, SEQ ID NO:182, SEQ ID NO:184,
SEQ ID NO:186, SEQ ID NO:188, SEQ ID NO:190, SEQ ID NO:192, SEQ ID NO:194,
SEQ ID NO:196, SEQ ID NO:198, SEQ ID NO:200, SEQ ID NO:202, SEQ ID NO:204,
SEQ ID NO:206, SEQ ID NO:208, SEQ ID NO:210, SEQ ID NO:212, SEQ ID NO:214,
SEQ ID NO:216, SEQ ID NO:218, SEQ ID NO:220, SEQ ID NO:222, SEQ ID NO:224,
SEQ ID NO:226, SEQ ID NO:228, SEQ ID NO:230, SEQ ID NO:232, SEQ ID NO:234,
SEQ ID NO:236, SEQ ID NO:238, SEQ ID NO:240, SEQ ID NO:242, SEQ ID NO:244,
SEQ ID NO:246, SEQ ID NO:248, SEQ ID NO:250, SEQ ID NO:252, SEQ ID NO:254,
SEQ ID NO:256, SEQ ID NO:258, SEQ ID NO:260, SEQ ID NO:262, SEQ ID NO:264,
SEQ ID NO:266, SEQ ID NO:268, SEQ ID NO:270, SEQ ID NO:272, SEQ ID NO:274,
SEQ ID NO:276, SEQ ID NO:278, SEQ ID NO:280, SEQ ID NO:282, SEQ ID NO:284,
SEQ ID NO:286, SEQ ID NO:288, SEQ ID NO:290, SEQ ID NO:292, SEQ ID NO:294,
SEQ ID NO:296, SEQ ID NO:298, SEQ ID NO:300, SEQ ID NO:302, SEQ ID NO:304,
SEQ ID NO:306, SEQ ID NO:308, SEQ ID NO:310, SEQ ID NO:312, SEQ ID NO:314,
SEQ ID NO:316, SEQ ID NO:318, SEQ ID NO:320, SEQ ID NO:322, SEQ ID NO:324,
SEQ ID NO:326, SEQ ID NO:328, SEQ ID NO:330, SEQ ID NO:332, SEQ ID NO:334,
SEQ ID NO:336, SEQ ID NO:338, SEQ ID NO:340, SEQ ID NO:342, SEQ ID NO:344,
SEQ ID NO:346, SEQ ID NO:348, SEQ ID NO:350, SEQ ID NO:352, SEQ ID NO:354,
SEQ ID NO:356, SEQ ID NO:358, SEQ ID NO:360, SEQ ID NO:362, SEQ ID NO:364,
SEQ ID NO:366, SEQ ID NO:368, SEQ ID NO:370, SEQ ID NO:372, SEQ ID NO:374,
SEQ ID NO:376, SEQ ID NO:378 or SEQ ID NO:380.

In one aspect, the xylanase activity comprises a specific activity at about 37°C 
in the range from about 1 to about 1200 units per milligram of protein, or, about 100 to about 
1000 units per milligram of protein. In another aspect, the xylanase activity comprises a 
specific activity from about 100 to about 1000 units per milligram of protein, or, from about 
500 to about 750 units per milligram of protein. Alternatively, the xylanase activity 
comprises a specific activity at 37°C in the range from about 1 to about 750 units per 
milligram of protein, or, from about 500 to about 1200 units per milligram of protein. In one
aspect, the xylanase activity comprises a specific activity at 37°C in the range from about 1 to 
about 500 units per milligram of protein, or, from about 750 to about 1000 units per 
milligram of protein. In another aspect, the xylanase activity comprises a specific activity at 
37°C in the range from about 1 to about 250 units per milligram of protein. Alternatively, the 
xylanase activity comprises a specific activity at 37°C in the range from about 1 to about 100 
units per milligram of protein. In another aspect, the thermotolerance comprises retention of
at least half of the specific activity of the xylanase at 37°C after being heated to the elevated 
temperature. Alternatively, the thermotolerance can comprise retention of specific activity at
37°C in the range from about 1 to about 1200 units per milligram of protein, or, from about 
500 to about 1000 units per milligram of protein, after being heated to the elevated 
temperature. In another aspect, the thermotolerance can comprise retention of specific 
activity at 37°C in the range from about 1 to about 500 units per milligram of protein after 
being heated to the elevated temperature. 
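
By way of illustration only, the half-retention criterion for thermotolerance can be expressed as sketched below. The sketch assumes that specific activity is reported as activity units per milligram of protein, measured at 37°C both before and after the heat treatment; the definition of a unit is not given in this passage, and the function names `specific_activity` and `retains_half_activity` are hypothetical.

```python
def specific_activity(units: float, mg_protein: float) -> float:
    """Specific activity in units per milligram of protein."""
    if mg_protein <= 0:
        raise ValueError("protein mass must be positive")
    return units / mg_protein


def retains_half_activity(before: float, after: float) -> bool:
    """True if the activity after heating is at least half of the activity before heating."""
    if before <= 0:
        raise ValueError("initial specific activity must be positive")
    return after >= 0.5 * before
```

For example, a preparation measured at 300 units per milligram before heating meets the criterion if it still shows 150 units per milligram or more afterwards.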
The invention provides the isolated or recombinant polypeptide of the
invention, wherein the polypeptide comprises at least one glycosylation site. In one aspect,
glycosylation can be an N-linked glycosylation. In one aspect, the polypeptide can be
glycosylated after being expressed in a P. pastoris or a S. pombe.

In one aspect, the polypeptide can retain a xylanase activity under conditions 
comprising about pH 6.5, pH 6, pH 5.5, pH 5, pH 4.5 or pH 4. In another aspect, the 
polypeptide can retain a xylanase activity under conditions comprising about pH 7, pH 7.5,
pH 8.0, pH 8.5, pH 9, pH 9.5, pH 10, pH 10.5 or pH 11. In one aspect, the polypeptide can
retain a xylanase activity after exposure to conditions comprising about pH 6.5, pH 6, pH 5.5, 
pH 5, pH 4.5 or pH 4. In another aspect, the polypeptide can retain a xylanase activity after 
exposure to conditions comprising about pH 7, pH 7.5, pH 8.0, pH 8.5, pH 9, pH 9.5, pH 10,
pH 10.5 or pH 11.

The invention provides protein preparations comprising a polypeptide of the 
invention, wherein the protein preparation comprises a liquid, a solid or a gel. 

The invention provides heterodimers comprising a polypeptide of the 
invention and a second protein or domain. The second member of the heterodimer can be a
different phospholipase, a different enzyme or another protein. In one aspect, the second 
domain can be a polypeptide and the heterodimer can be a fusion protein. In one aspect, the 
second domain can be an epitope or a tag. In one aspect, the invention provides homodimers 
comprising a polypeptide of the invention. 

The invention provides immobilized polypeptides having a xylanase activity, 
wherein the polypeptide comprises a polypeptide of the invention, a polypeptide encoded by 
a nucleic acid of the invention, or a polypeptide comprising a polypeptide of the invention 
and a second domain. In one aspect, the polypeptide can be immobilized on a cell, a metal, a 
resin, a polymer, a ceramic, a glass, a microelectrode, a graphitic particle, a bead, a gel, a 
plate, an array or a capillary tube. 

The invention provides arrays comprising an immobilized nucleic acid of the 
invention. The invention provides arrays comprising an antibody of the invention. 

The invention provides isolated or recombinant antibodies that specifically 
bind to a polypeptide of the invention or to a polypeptide encoded by a nucleic acid of the 
invention. The antibody can be a monoclonal or a polyclonal antibody. The invention 
provides hybridomas comprising an antibody of the invention, e.g., an antibody that 
specifically binds to a polypeptide of the invention or to a polypeptide encoded by a nucleic 
acid of the invention.

The invention provides methods of isolating or identifying a polypeptide
having a xylanase activity comprising the steps of: (a) providing an antibody of the invention;
(b) providing a sample comprising polypeptides; and (c) contacting the sample of step (b)
with the antibody of step (a) under conditions wherein the antibody can specifically bind to
the polypeptide, thereby isolating or identifying a polypeptide having a xylanase activity.

The invention provides methods of making an anti-xylanase antibody 
comprising administering to a non-human animal a nucleic acid of the invention or a 
polypeptide of the invention or subsequences thereof in an amount sufficient to generate a 
humoral immune response, thereby making an anti-xylanase antibody. The invention
provides methods of making an anti-xylanase immune response comprising administering to a
non-human animal a nucleic acid of the invention or a polypeptide of the invention or
subsequences thereof in an amount sufficient to generate an immune response. 

The invention provides methods of producing a recombinant polypeptide 
comprising the steps of: (a) providing a nucleic acid of the invention operably linked to a
promoter; and (b) expressing the nucleic acid of step (a) under conditions that allow
expression of the polypeptide, thereby producing a recombinant polypeptide. In one aspect, 
the method can further comprise transforming a host cell with the nucleic acid of step (a) 
followed by expressing the nucleic acid of step (a), thereby producing a recombinant 
polypeptide in a transformed cell. 

The invention provides methods for identifying a polypeptide having a
xylanase activity comprising the following steps: (a) providing a polypeptide of the 
invention; or a polypeptide encoded by a nucleic acid of the invention; (b) providing a 
xylanase substrate; and (c) contacting the polypeptide or a fragment or variant thereof of step 
(a) with the substrate of step (b) and detecting a decrease in the amount of substrate or an
increase in the amount of a reaction product, wherein a decrease in the amount of the
substrate or an increase in the amount of the reaction product detects a polypeptide having a 
xylanase activity. 

The invention provides methods for identifying a xylanase substrate 
comprising the following steps: (a) providing a polypeptide of the invention; or a polypeptide
encoded by a nucleic acid of the invention; (b) providing a test substrate; and (c) contacting
the polypeptide of step (a) with the test substrate of step (b) and detecting a decrease in the 
amount of substrate or an increase in the amount of reaction product, wherein a decrease in 
the amount of the substrate or an increase in the amount of a reaction product identifies the 
test substrate as a xylanase substrate.

The invention provides methods of determining whether a test compound
specifically binds to a polypeptide comprising the following steps: (a) expressing a nucleic
acid or a vector comprising the nucleic acid under conditions permissive for translation of the
nucleic acid to a polypeptide, wherein the nucleic acid comprises a nucleic acid of the
invention, or, providing a polypeptide of the invention; (b) providing a test compound; (c)
contacting the polypeptide with the test compound; and (d) determining whether the test 
compound of step (b) specifically binds to the polypeptide. 

The invention provides methods for identifying a modulator of a xylanase 
activity comprising the following steps: (a) providing a polypeptide of the invention or a
polypeptide encoded by a nucleic acid of the invention; (b) providing a test compound; (c)
contacting the polypeptide of step (a) with the test compound of step (b) and measuring an 
activity of the xylanase, wherein a change in the xylanase activity measured in the presence 
of the test compound compared to the activity in the absence of the test compound provides a 
determination that the test compound modulates the xylanase activity. In one aspect, the
xylanase activity can be measured by providing a xylanase substrate and detecting a decrease
in the amount of the substrate or an increase in the amount of a reaction product, or, an 
increase in the amount of the substrate or a decrease in the amount of a reaction product. A 
decrease in the amount of the substrate or an increase in the amount of the reaction product 
with the test compound as compared to the amount of substrate or reaction product without
the test compound identifies the test compound as an activator of xylanase activity. An
increase in the amount of the substrate or a decrease in the amount of the reaction product 
with the test compound as compared to the amount of substrate or reaction product without 
the test compound identifies the test compound as an inhibitor of xylanase activity. 
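
By way of illustration only, the activator/inhibitor determination described above can be sketched as a comparison of reaction product measured with and without the test compound. The sketch assumes that product amounts are measured under otherwise identical conditions and that a relative change smaller than a chosen tolerance is treated as no effect; the 5% default tolerance and the function name `classify_modulator` are assumptions made for this example and are not taken from the disclosure.

```python
def classify_modulator(product_without: float, product_with: float,
                       tolerance: float = 0.05) -> str:
    """Classify a test compound from reaction product formed without and with it."""
    if product_without <= 0:
        raise ValueError("control reaction must form a measurable amount of product")
    # Relative change in product caused by the test compound.
    change = (product_with - product_without) / product_without
    if change > tolerance:
        return "activator"   # more product formed in the presence of the compound
    if change < -tolerance:
        return "inhibitor"   # less product formed in the presence of the compound
    return "no effect"
```

The same comparison can be made on remaining substrate instead of product, with the signs reversed.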

The invention provides computer systems comprising a processor and a data
storage device wherein said data storage device has stored thereon a polypeptide sequence or
a nucleic acid sequence of the invention (e.g., a polypeptide encoded by a nucleic acid of the 
invention). In one aspect, the computer system can further comprise a sequence comparison 
algorithm and a data storage device having at least one reference sequence stored thereon. In 
another aspect, the sequence comparison algorithm comprises a computer program that
indicates polymorphisms. In one aspect, the computer system can further comprise an
identifier that identifies one or more features in said sequence. The invention provides 
computer readable media having stored thereon a polypeptide sequence or a nucleic acid 
sequence of the invention. The invention provides methods for identifying a feature in a 
sequence comprising the steps of: (a) reading the sequence using a computer program which
identifies one or more features in a sequence, wherein the sequence comprises a polypeptide
sequence or a nucleic acid sequence of the invention; and (b) identifying one or more features 
in the sequence with the computer program. The invention provides methods for comparing 
a first sequence to a second sequence comprising the steps of: (a) reading the first sequence 
and the second sequence through use of a computer program which compares sequences,
wherein the first sequence comprises a polypeptide sequence or a nucleic acid sequence of 
the invention; and (b) determining differences between the first sequence and the second 
sequence with the computer program. The step of determining differences between the first 
sequence and the second sequence can further comprise the step of identifying 
polymorphisms. In one aspect, the method can further comprise an identifier that identifies
one or more features in a sequence. In another aspect, the method can comprise reading the 
first sequence using a computer program and identifying one or more features in the 
sequence. 
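
By way of illustration only, the step of determining differences between a first sequence and a second sequence can be sketched as a position-by-position comparison. The sketch assumes that the two sequences have the same length and are already in register (no insertions or deletions); a practical comparison program would align the sequences first. The function name `find_differences` is hypothetical.

```python
def find_differences(first: str, second: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, residue in first, residue in second) for each mismatch."""
    if len(first) != len(second):
        raise ValueError("this sketch compares sequences of equal length only")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(first, second), start=1)
        if a != b
    ]
```

Each reported position is a candidate polymorphism between the two sequences; an empty list means the sequences are identical.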

The invention provides methods for isolating or recovering a nucleic acid
encoding a polypeptide having a xylanase activity from an environmental sample comprising
the steps of: (a) providing an amplification primer sequence pair for amplifying a nucleic acid 
encoding a polypeptide having a xylanase activity, wherein the primer pair is capable of 
amplifying a nucleic acid of the invention; (b) isolating a nucleic acid from the environmental 
sample or treating the environmental sample such that nucleic acid in the sample is accessible
for hybridization to the amplification primer pair; and, (c) combining the nucleic acid of step
(b) with the amplification primer pair of step (a) and amplifying nucleic acid from the 
environmental sample, thereby isolating or recovering a nucleic acid encoding a polypeptide 
having a xylanase activity from an environmental sample. One or each member of the 
amplification primer sequence pair can comprise an oligonucleotide comprising at least about
10 to 50 consecutive bases of a sequence of the invention. In one aspect, the amplification
primer sequence pair is an amplification pair of the invention. 
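
By way of illustration only, the outcome of combining a primer pair with a nucleic acid from a sample can be modeled in software as sketched below. The sketch assumes exact primer matches on a single linear DNA template written 5' to 3', which is a simplification: actual amplification tolerates some mismatches and depends on reaction conditions. The function names `reverse_complement` and `in_silico_amplicon` are hypothetical.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def in_silico_amplicon(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the region bounded by the two primers, or None if either primer is absent."""
    start = template.find(forward)
    if start < 0:
        return None
    # The reverse primer anneals to the given strand where its reverse complement occurs.
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]
```

A return value of `None` models a sample nucleic acid from which the primer pair amplifies nothing.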

The invention provides methods for isolating or recovering a nucleic acid 
encoding a polypeptide having a xylanase activity from an environmental sample comprising 
the steps of: (a) providing a polynucleotide probe comprising a nucleic acid of the invention
or a subsequence thereof; (b) isolating a nucleic acid from the environmental sample or
treating the environmental sample such that nucleic acid in the sample is accessible for 
hybridization to a polynucleotide probe of step (a); (c) combining the isolated nucleic acid or 
the treated environmental sample of step (b) with the polynucleotide probe of step (a); and (d) 
isolating a nucleic acid that specifically hybridizes with the polynucleotide probe of step (a), 
thereby isolating or recovering a nucleic acid encoding a polypeptide having a xylanase
activity from an environmental sample. The environmental sample can comprise a water 
sample, a liquid sample, a soil sample, an air sample or a biological sample. In one aspect, 
the biological sample can be derived from a bacterial cell, a protozoan cell, an insect cell, a 
yeast cell, a plant cell, a fungal cell or a mammalian cell. 

The invention provides methods of generating a variant of a nucleic acid 
encoding a polypeptide having a xylanase activity comprising the steps of: (a) providing a 
template nucleic acid comprising a nucleic acid of the invention; and (b) modifying, deleting 
or adding one or more nucleotides in the template sequence, or a combination thereof, to 
generate a variant of the template nucleic acid. In one aspect, the method can further 
comprise expressing the variant nucleic acid to generate a variant xylanase polypeptide. The 
modifications, additions or deletions can be introduced by a method comprising error-prone 
PCR, shuffling, oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR 
mutagenesis, in vivo mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, 
exponential ensemble mutagenesis, site-specific mutagenesis, gene reassembly (e.g., 
GeneReassembly™, see, e.g., U.S. Patent No. 6,537,776), gene site saturated mutagenesis 
(GSSM™), synthetic ligation reassembly (SLR) or a combination thereof. In another aspect, 
the modifications, additions or deletions are introduced by a method comprising 
recombination, recursive sequence recombination, phosphothioate-modified DNA 
mutagenesis, uracil-containing template mutagenesis, gapped duplex mutagenesis, point 
mismatch repair mutagenesis, repair-deficient host strain mutagenesis, chemical mutagenesis, 
radiogenic mutagenesis, deletion mutagenesis, restriction-selection mutagenesis,
restriction-purification mutagenesis, artificial gene synthesis, ensemble mutagenesis, chimeric nucleic
acid multimer creation and a combination thereof. 

In one aspect, the method can be iteratively repeated until a xylanase having 
an altered or different activity or an altered or different stability from that of a polypeptide 
encoded by the template nucleic acid is produced. In one aspect, the variant xylanase 
polypeptide is thermotolerant, and retains some activity after being exposed to an elevated 
temperature. In another aspect, the variant xylanase polypeptide has increased glycosylation 
as compared to the xylanase encoded by a template nucleic acid. Alternatively, the variant 
xylanase polypeptide has a xylanase activity under a high temperature, wherein the xylanase 
encoded by the template nucleic acid is not active under the high temperature. In one aspect, 
the method can be iteratively repeated until a xylanase coding sequence having an altered 
codon usage from that of the template nucleic acid is produced. In another aspect, the 
method can be iteratively repeated until a xylanase gene having higher or lower level of
message expression or stability from that of the template nucleic acid is produced. 

In one aspect, the invention provides isolated or recombinant nucleic acids 
comprising a sequence as set forth in SEQ ID NO: 189, wherein SEQ ID NO: 189 contains
one or more of the following mutations: the nucleotides at positions 22 to 24 are TTC, the
nucleotides at positions 31 to 33 are CAC, the nucleotides at positions 34 to 36 are TTG, the
nucleotides at positions 49 to 51 are ATA, the nucleotides at positions 31 to 33 are CAT, the
nucleotides at positions 67 to 69 are ACG, the nucleotides at positions 178 to 180 are CAC,
the nucleotides at positions 190 to 192 are TGT, the nucleotides at positions 190 to 192 are
GTA, the nucleotides at positions 190 to 192 are GTT, the nucleotides at positions 193 to 195
are GTG, the nucleotides at positions 202 to 204 are GCT, the nucleotides at positions 235 to 
237 are CCA, or the nucleotides at positions 235 to 237 are CCC. In one aspect, the 
invention provides methods for making a nucleic acid comprising this sequence, wherein the 
mutations in SEQ ID NO: 189 are obtained by gene site saturated mutagenesis (GSSM™). 

In one aspect, the invention provides isolated or recombinant nucleic acids
comprising SEQ ID NO: 190, wherein SEQ ID NO: 190 contains one or more of the
following mutations: the aspartic acid at amino acid position 8 is phenylalanine, the
glutamine at amino acid position 11 is histidine, the asparagine at amino acid position 12 is
leucine, the glycine at amino acid position 17 is isoleucine, the threonine at amino acid
position 23 is threonine encoded by a codon other than the wild type codon, the glycine at
amino acid position 60 is histidine, the proline at amino acid position 64 is cysteine, the 
proline at amino acid position 64 is valine, the serine at amino acid position 65 is valine, the 
glycine at amino acid position 68 is isoleucine, the glycine at amino acid position 68 is 
alanine, or the valine at amino acid position 79 is proline. 
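The nucleotide positions recited for SEQ ID NO: 189 and the amino acid positions recited for SEQ ID NO: 190 can be related by a short calculation, sketched below in Python. The sketch assumes that the reading frame begins at nucleotide 1 and that the standard genetic code applies; the function name `describe_codon` is hypothetical.

```python
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def describe_codon(start, end, codon):
    """Map a 1-based nucleotide triplet to (amino acid position, residue).

    Assumes the coding frame starts at nucleotide 1 (an assumption of this sketch).
    """
    if end - start != 2 or (start - 1) % 3 != 0:
        raise ValueError("positions do not delimit an in-frame codon")
    return (start - 1) // 3 + 1, CODON_TABLE[codon.upper()]

print(describe_codon(22, 24, "TTC"))    # amino acid position 8, phenylalanine (F)
print(describe_codon(190, 192, "TGT"))  # amino acid position 64, cysteine (C)
```

Under these assumptions, nucleotides 22 to 24 correspond to amino acid position 8 and nucleotides 235 to 237 to amino acid position 79, in agreement with the positions recited above.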

The invention provides methods for modifying codons in a nucleic acid
encoding a polypeptide having a xylanase activity to increase its expression in a host cell, the
method comprising the following steps: (a) providing a nucleic acid of the invention
encoding a polypeptide having a xylanase activity; and, (b) identifying a non-preferred or a
less preferred codon in the nucleic acid of step (a) and replacing it with a preferred or
neutrally used codon encoding the same amino acid as the replaced codon, wherein a
preferred codon is a codon over-represented in coding sequences in genes in the host cell and
a non-preferred or less preferred codon is a codon under-represented in coding sequences in 
genes in the host cell, thereby modifying the nucleic acid to increase its expression in a host 
cell.

The invention provides methods for modifying codons in a nucleic acid
encoding a polypeptide having a xylanase activity, the method comprising the following
steps: (a) providing a nucleic acid of the invention; and, (b) identifying a codon in the nucleic 
acid of step (a) and replacing it with a different codon encoding the same amino acid as the 
replaced codon, thereby modifying codons in a nucleic acid encoding a xylanase.

The invention provides methods for modifying codons in a nucleic acid 
encoding a polypeptide having a xylanase activity to increase its expression in a host cell, the 
method comprising the following steps: (a) providing a nucleic acid of the invention 
encoding a xylanase polypeptide; and, (b) identifying a non-preferred or a less preferred
codon in the nucleic acid of step (a) and replacing it with a preferred or neutrally used codon
encoding the same amino acid as the replaced codon, wherein a preferred codon is a codon 
over-represented in coding sequences in genes in the host cell and a non-preferred or less 
preferred codon is a codon under-represented in coding sequences in genes in the host cell, 
thereby modifying the nucleic acid to increase its expression in a host cell. 

The invention provides methods for modifying a codon in a nucleic acid
encoding a polypeptide having a xylanase activity to decrease its expression in a host cell, the
method comprising the following steps: (a) providing a nucleic acid of the invention; and (b)
identifying at least one preferred codon in the nucleic acid of step (a) and replacing it with a
non-preferred or less preferred codon encoding the same amino acid as the replaced codon,
wherein a preferred codon is a codon over-represented in coding sequences in genes in a host
cell and a non-preferred or less preferred codon is a codon under-represented in coding 
sequences in genes in the host cell, thereby modifying the nucleic acid to decrease its 
expression in a host cell. In one aspect, the host cell can be a bacterial cell, a fungal cell, an 
insect cell, a yeast cell, a plant cell or a mammalian cell. 
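The codon replacement of step (b), in either direction, might be sketched in Python as follows. The usage frequencies in `USAGE` are invented for illustration and do not describe any real host cell; the function names are hypothetical; and the sketch simply selects the most used (or least used) synonymous codon, which is only one way of choosing a "preferred" or "less preferred" codon.

```python
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Hypothetical relative codon usage in a host cell (illustrative values only).
USAGE = {"CTG": 0.50, "CTA": 0.04, "TTA": 0.13, "AAA": 0.74, "AAG": 0.26}

def translate(cds):
    """Translate an in-frame coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

def recode(cds, usage, increase=True):
    """Swap each codon for the most (or least) used synonymous codon.

    Codons whose amino acid has no entry in `usage` are left unchanged.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    pick = max if increase else min
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        synonyms = [c for c in usage if CODON_TABLE[c] == CODON_TABLE[codon]]
        out.append(pick(synonyms, key=usage.get) if synonyms else codon)
    return "".join(out)

original = "ATGCTAAAGTTA"
raised = recode(original, USAGE, increase=True)
lowered = recode(original, USAGE, increase=False)
```

Because only synonymous codons are exchanged, `translate` returns the same amino acid sequence for the original and for both recoded sequences.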

The invention provides methods for producing a library of nucleic acids
encoding a plurality of modified xylanase active sites or substrate binding sites, wherein the
modified active sites or substrate binding sites are derived from a first nucleic acid
comprising a sequence encoding a first active site or a first substrate binding site, the method
comprising the following steps: (a) providing a first nucleic acid encoding a first active site
or first substrate binding site, wherein the first nucleic acid sequence comprises a sequence
that hybridizes under stringent conditions to a nucleic acid of the invention, and the nucleic 
acid encodes a xylanase active site or a xylanase substrate binding site; (b) providing a set of 
mutagenic oligonucleotides that encode naturally-occurring amino acid variants at a plurality 
of targeted codons in the first nucleic acid; and, (c) using the set of mutagenic 
oligonucleotides to generate a set of active site-encoding or substrate binding site-encoding
variant nucleic acids encoding a range of amino acid variations at each amino acid codon that 
was mutagenized, thereby producing a library of nucleic acids encoding a plurality of 
modified xylanase active sites or substrate binding sites. In one aspect, the method comprises 
mutagenizing the first nucleic acid of step (a) by a method comprising an optimized directed
evolution system, gene site-saturation mutagenesis (GSSM™), synthetic ligation reassembly 
(SLR), error-prone PCR, shuffling, oligonucleotide-directed mutagenesis, assembly PCR, 
sexual PCR mutagenesis, in vivo mutagenesis, cassette mutagenesis, recursive ensemble 
mutagenesis, exponential ensemble mutagenesis, site-specific mutagenesis, gene reassembly
(GeneReassembly™, U.S. Patent No. 6,537,776), gene site saturated mutagenesis (GSSM™),
synthetic ligation reassembly (SLR) and a combination thereof. In another aspect, the 
method comprises mutagenizing the first nucleic acid of step (a) or variants by a method 
comprising recombination, recursive sequence recombination, phosphothioate-modified DNA 
mutagenesis, uracil-containing template mutagenesis, gapped duplex mutagenesis, point
mismatch repair mutagenesis, repair-deficient host strain mutagenesis, chemical mutagenesis,
radiogenic mutagenesis, deletion mutagenesis, restriction-selection mutagenesis,
restriction-purification mutagenesis, artificial gene synthesis, ensemble mutagenesis, chimeric nucleic
acid multimer creation and a combination thereof. 
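The generation of a range of amino acid variations at each targeted codon, as in steps (b) and (c), can be illustrated with the following Python sketch. It assumes a degenerate NNK codon (N = A, C, G or T; K = G or T) at each targeted position, which is one common saturation scheme and is an assumption here rather than a statement of how the mutagenic oligonucleotides of the invention are designed. The function name and the example coding sequence are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# The 32 codons represented by the degenerate triplet NNK.
NNK = ["".join(p) for p in product("ACGT", "ACGT", "GT")]

def saturation_library(cds, targeted_codons):
    """Return (codon_number, new_codon, variant_cds) for every NNK substitution.

    Codon numbers are 1-based; the original codon itself is skipped.
    """
    cds = cds.upper()
    variants = []
    for n in targeted_codons:
        s = 3 * (n - 1)
        for codon in NNK:
            if codon != cds[s:s + 3]:
                variants.append((n, codon, cds[:s] + codon + cds[s + 3:]))
    return variants

# Hypothetical 4-codon sequence with codons 2 and 3 targeted.
library = saturation_library("ATGGATCAGAAC", [2, 3])
encoded = {CODON_TABLE[c] for c in NNK}
```

In this scheme the 32 NNK codons together encode all twenty naturally occurring amino acids and a single stop codon (TAG).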

The invention provides methods for making a small molecule comprising the
following steps: (a) providing a plurality of biosynthetic enzymes capable of synthesizing or
modifying a small molecule, wherein one of the enzymes comprises a xylanase enzyme 
encoded by a nucleic acid of the invention; (b) providing a substrate for at least one of the 
enzymes of step (a); and (c) reacting the substrate of step (b) with the enzymes under 
conditions that facilitate a plurality of biocatalytic reactions to generate a small molecule by a
series of biocatalytic reactions. The invention provides methods for modifying a small
molecule comprising the following steps: (a) providing a xylanase enzyme, wherein the 
enzyme comprises a polypeptide of the invention, or, a polypeptide encoded by a nucleic acid 
of the invention, or a subsequence thereof; (b) providing a small molecule; and (c) reacting 
the enzyme of step (a) with the small molecule of step (b) under conditions that facilitate an
enzymatic reaction catalyzed by the xylanase enzyme, thereby modifying a small molecule
by a xylanase enzymatic reaction. In one aspect, the method can comprise a plurality of 
small molecule substrates for the enzyme of step (a), thereby generating a library of modified 
small molecules produced by at least one enzymatic reaction catalyzed by the xylanase 
enzyme. In one aspect, the method can comprise a plurality of additional enzymes under 
conditions that facilitate a plurality of biocatalytic reactions by the enzymes to form a library
of modified small molecules produced by the plurality of enzymatic reactions. In another 
aspect, the method can further comprise the step of testing the library to determine if a 
particular modified small molecule that exhibits a desired activity is present within the 
library. The step of testing the library can further comprise the steps of systematically
eliminating all but one of the biocatalytic reactions used to produce a portion of the plurality
of the modified small molecules within the library by testing the portion of the modified 
small molecule for the presence or absence of the particular modified small molecule with a 
desired activity, and identifying at least one specific biocatalytic reaction that produces the
particular modified small molecule of desired activity.

The invention provides methods for determining a functional fragment of a 
xylanase enzyme comprising the steps of: (a) providing a xylanase enzyme, wherein the 
enzyme comprises a polypeptide of the invention, or a polypeptide encoded by a nucleic acid 
of the invention, or a subsequence thereof; and (b) deleting a plurality of amino acid residues
from the sequence of step (a) and testing the remaining subsequence for a xylanase activity,
thereby determining a functional fragment of a xylanase enzyme. In one aspect, the xylanase
activity is measured by providing a xylanase substrate and detecting a decrease in the amount
of the substrate or an increase in the amount of a reaction product.
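The enumeration of candidate fragments in step (b) might be sketched in Python as below. The activity test is an experimental assay and is represented here only by a caller-supplied function `is_active`; that function, the toy sequence and the toy activity rule are all hypothetical, and only deletions from the two ends of the sequence are considered.

```python
def truncations(sequence, min_length=1, min_deleted=2):
    """Return contiguous subsequences obtained by deleting residues from the ends.

    At least `min_deleted` residues in total are removed, reflecting the
    deletion of a plurality of residues.
    """
    n = len(sequence)
    return [sequence[i:j]
            for i in range(n)
            for j in range(i + min_length, n + 1)
            if n - (j - i) >= min_deleted]

def smallest_active_fragment(sequence, is_active, min_length=1):
    """Return the shortest truncation for which is_active() is true, or None."""
    active = [f for f in truncations(sequence, min_length) if is_active(f)]
    return min(active, key=len) if active else None

# Toy stand-in for an assay: a fragment is "active" if it keeps a hypothetical motif.
toy_sequence = "MKTAYWGDEQ"
fragment = smallest_active_fragment(toy_sequence, lambda f: "AYWGD" in f, min_length=5)
```

In practice each candidate subsequence would be expressed and assayed, for example by detecting a decrease in substrate or an increase in reaction product as described above.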

The invention provides methods for whole cell engineering of new or
modified phenotypes by using real-time metabolic flux analysis, the method comprising the
following steps: (a) making a modified cell by modifying the genetic composition of a cell, 
wherein the genetic composition is modified by addition to the cell of a nucleic acid of the 
invention; (b) culturing the modified cell to generate a plurality of modified cells; (c) 
measuring at least one metabolic parameter of the cell by monitoring the cell culture of step
(b) in real time; and, (d) analyzing the data of step (c) to determine if the measured parameter
differs from a comparable measurement in an unmodified cell under similar conditions, 
thereby identifying an engineered phenotype in the cell using real-time metabolic flux 
analysis. In one aspect, the genetic composition of the cell can be modified by a method 
comprising deletion of a sequence or modification of a sequence in the cell, or, knocking out
the expression of a gene. In one aspect, the method can further comprise selecting a cell
comprising a newly engineered phenotype. In another aspect, the method can comprise 
culturing the selected cell, thereby generating a new cell strain comprising a newly 
engineered phenotype. 






The invention provides methods of increasing thermotolerance or 
thermostability of a xylanase polypeptide, the method comprising glycosylating a xylanase 
polypeptide, wherein the polypeptide comprises at least thirty contiguous amino acids of a 
polypeptide of the invention; or a polypeptide encoded by a nucleic acid sequence of the 
invention, thereby increasing the thermotolerance or thermostability of the xylanase
polypeptide. In one aspect, the xylanase specific activity can be thermostable or
thermotolerant at a temperature in the range from greater than about 37°C to about 95°C.

The invention provides methods for overexpressing a recombinant xylanase 
polypeptide in a cell comprising expressing a vector comprising a nucleic acid comprising a
nucleic acid of the invention or a nucleic acid sequence of the invention, wherein the
sequence identities are determined by analysis with a sequence comparison algorithm or by
visual inspection, wherein overexpression is effected by use of a high activity promoter, a 
dicistronic vector or by gene amplification of the vector. 

The invention provides methods of making a transgenic plant comprising the
following steps: (a) introducing a heterologous nucleic acid sequence into the cell, wherein
the heterologous nucleic sequence comprises a nucleic acid sequence of the invention, 
thereby producing a transformed plant cell; and (b) producing a transgenic plant from the 
transformed cell. In one aspect, the step (a) can further comprise introducing the 
heterologous nucleic acid sequence by electroporation or microinjection of plant cell
protoplasts. In another aspect, the step (a) can further comprise introducing the heterologous
nucleic acid sequence directly to plant tissue by DNA particle bombardment. Alternatively, 
the step (a) can further comprise introducing the heterologous nucleic acid sequence into the 
plant cell DNA using an Agrobacterium tumefaciens host. In one aspect, the plant cell can be 
a potato, corn, rice, wheat, tobacco, or barley cell. 

The invention provides methods of expressing a heterologous nucleic acid
sequence in a plant cell comprising the following steps: (a) transforming the plant cell with a
heterologous nucleic acid sequence operably linked to a promoter, wherein the heterologous 
nucleic sequence comprises a nucleic acid of the invention; (b) growing the plant under 
conditions wherein the heterologous nucleic acid sequence is expressed in the plant cell.
The invention provides methods of expressing a heterologous nucleic acid sequence in a plant
cell comprising the following steps: (a) transforming the plant cell with a heterologous 
nucleic acid sequence operably linked to a promoter, wherein the heterologous nucleic 
sequence comprises a sequence of the invention; (b) growing the plant under conditions 
wherein the heterologous nucleic acid sequence is expressed in the plant cell.

The invention provides methods for hydrolyzing, breaking up or disrupting a
xylan-comprising composition comprising the following steps: (a) providing a polypeptide of 
the invention having a xylanase activity, or a polypeptide encoded by a nucleic acid of the 
invention; (b) providing a composition comprising a xylan; and (c) contacting the 
polypeptide of step (a) with the composition of step (b) under conditions wherein the
xylanase hydrolyzes, breaks up or disrupts the xylan-comprising composition. In one aspect,
the composition comprises a plant cell, a bacterial cell, a yeast cell, an insect cell, or an 
animal cell. Thus, the composition can comprise any plant or plant part, any
xylan-containing food or feed, a waste product and the like. The invention provides methods for
liquefying or removing a xylan-comprising composition comprising the following steps: (a)
providing a polypeptide of the invention having a xylanase activity, or a polypeptide encoded 
by a nucleic acid of the invention; (b) providing a composition comprising a xylan; and (c) 
contacting the polypeptide of step (a) with the composition of step (b) under conditions 
wherein the xylanase removes, softens or liquefies the xylan-comprising composition. 

The invention provides detergent compositions comprising a polypeptide of
the invention, or a polypeptide encoded by a nucleic acid of the invention, wherein the
polypeptide has a xylanase activity. The xylanase can be a nonsurface-active xylanase or a 
surface-active xylanase. The xylanase can be formulated in a non-aqueous liquid 
composition, a cast solid, a granular form, a particulate form, a compressed tablet, a gel form,
a paste or a slurry form. The invention provides methods for washing an object comprising
the following steps: (a) providing a composition comprising a polypeptide of the invention 
having a xylanase activity, or a polypeptide encoded by a nucleic acid of the invention; (b) 
providing an object; and (c) contacting the polypeptide of step (a) and the object of step (b) 
under conditions wherein the composition can wash the object. 

The invention provides textiles or fabrics, including, e.g., threads, comprising
a polypeptide of the invention, or a polypeptide encoded by a nucleic acid of the invention.
In one aspect, the textiles or fabrics comprise xylan-containing fibers. The invention 
provides methods for treating a textile or fabric (e.g., removing a stain from a composition) 
comprising the following steps: (a) providing a composition comprising a polypeptide of the
invention having a xylanase activity, or a polypeptide encoded by a nucleic acid of the
invention; (b) providing a textile or fabric comprising a xylan; and (c) contacting the 
polypeptide of step (a) and the composition of step (b) under conditions wherein the xylanase 
can treat the textile or fabric (e.g., remove the stain). The invention provides methods for 
improving the finish of a fabric comprising the following steps: (a) providing a composition 
comprising a polypeptide of the invention having a xylanase activity, or a polypeptide
encoded by a nucleic acid of the invention; (b) providing a fabric; and (c) contacting the
polypeptide of step (a) and the fabric of step (b) under conditions wherein the polypeptide
can treat the fabric thereby improving the finish of the fabric. In one aspect, the fabric is a
wool or a silk.

The invention provides feeds or foods comprising a polypeptide of the 
invention, or a polypeptide encoded by a nucleic acid of the invention. The invention 
provides methods for hydrolyzing xylans in a feed or a food prior to consumption by an 
animal comprising the following steps: (a) obtaining a feed material comprising a xylanase of
the invention, or a xylanase encoded by a nucleic acid of the invention; and (b) adding the
polypeptide of step (a) to the feed or food material in an amount sufficient for a sufficient 
time period to cause hydrolysis of the xylan and formation of a treated food or feed, thereby 
hydrolyzing the xylans in the food or the feed prior to consumption by the animal. In one 
aspect, the invention provides methods for hydrolyzing xylans in a feed or a food after
consumption by an animal comprising the following steps: (a) obtaining a feed material
comprising a xylanase of the invention, or a xylanase encoded by a nucleic acid of the 
invention; (b) adding the polypeptide of step (a) to the feed or food material; and (c) 
administering the feed or food material to the animal, wherein after consumption, the 
xylanase causes hydrolysis of xylans in the feed or food in the digestive tract of the animal.
The food or the feed can be, e.g., a cereal, a grain, a corn and the like.

The invention provides food or nutritional supplements for an animal 
comprising a polypeptide of the invention, e.g., a polypeptide encoded by the nucleic acid of 
the invention. In one aspect, the polypeptide in the food or nutritional supplement can be 
glycosylated. The invention provides edible enzyme delivery matrices comprising a
polypeptide of the invention, e.g., a polypeptide encoded by the nucleic acid of the invention.
In one aspect, the delivery matrix comprises a pellet. In one aspect, the polypeptide can be 
glycosylated. In one aspect, the xylanase activity is thermotolerant. In another aspect, the 
xylanase activity is thermostable. 

The invention provides a food, a feed or a nutritional supplement comprising a
polypeptide of the invention. The invention provides methods for utilizing a xylanase as a
nutritional supplement in an animal diet, the method comprising: preparing a nutritional 
supplement containing a xylanase enzyme comprising at least thirty contiguous amino acids 
of a polypeptide of the invention; and administering the nutritional supplement to an animal 
to increase utilization of a xylan contained in a feed or a food ingested by the animal. The 
animal can be a human, a ruminant or a monogastric animal. The xylanase enzyme can be
prepared by expression of a polynucleotide encoding the xylanase in an organism selected 
from the group consisting of a bacterium, a yeast, a plant, an insect, a fungus and an animal. 
The organism can be selected from the group consisting of an S. pombe, S. cerevisiae, Pichia
pastoris, Pseudomonas sp., E. coli, Streptomyces sp., Bacillus sp. and Lactobacillus sp.

The invention provides an edible enzyme delivery matrix comprising a
thermostable recombinant xylanase enzyme, e.g., a polypeptide of the invention. The 
invention provides methods for delivering a xylanase supplement to an animal, the method 
comprising: preparing an edible enzyme delivery matrix in the form of pellets comprising a
granulate edible carrier and a thermostable recombinant xylanase enzyme, wherein the pellets
readily disperse the xylanase enzyme contained therein into aqueous media, and 
administering the edible enzyme delivery matrix to the animal. The recombinant xylanase 
enzyme can comprise a polypeptide of the invention. The granulate edible carrier can 
comprise a carrier selected from the group consisting of a grain germ, a grain germ that is
spent of oil, a hay, an alfalfa, a timothy, a soy hull, a sunflower seed meal and a wheat midd.
The edible carrier can comprise grain germ that is spent of oil. The xylanase enzyme can be
glycosylated to provide thermostability at pelletizing conditions. The delivery matrix can be 
formed by pelletizing a mixture comprising a grain germ and a xylanase. The pelletizing 
conditions can include application of steam. The pelletizing conditions can comprise
application of a temperature in excess of about 80°C for about 5 minutes and the enzyme
retains a specific activity of at least 350 to about 900 units per milligram of enzyme. 

The invention provides methods for improving texture and flavor of a dairy 
product comprising the following steps: (a) providing a polypeptide of the invention having a 
xylanase activity, or a xylanase encoded by a nucleic acid of the invention; (b) providing a
dairy product; and (c) contacting the polypeptide of step (a) and the dairy product of step (b)
under conditions wherein the xylanase can improve the texture or flavor of the dairy product. 
In one aspect, the dairy product comprises a cheese or a yogurt. The invention provides dairy 
products comprising a xylanase of the invention, or a xylanase encoded by a nucleic acid of the
invention. 

The invention provides methods for improving the extraction of oil from an
oil-rich plant material comprising the following steps: (a) providing a polypeptide of the
invention having a xylanase activity, or a xylanase encoded by a nucleic acid of the 
invention; (b) providing an oil-rich plant material; and (c) contacting the polypeptide of step 
(a) and the oil-rich plant material. In one aspect, the oil-rich plant material comprises an
oil-rich seed. The oil can be a soybean oil, an olive oil, a rapeseed (canola) oil or a sunflower
oil. 

The invention provides methods for preparing a fruit or vegetable juice, syrup, 
puree or extract comprising the following steps: (a) providing a polypeptide of the invention 
having a xylanase activity, or a xylanase encoded by a nucleic acid of the invention; (b)
providing a composition or a liquid comprising a fruit or vegetable material; and (c) 
contacting the polypeptide of step (a) and the composition, thereby preparing the fruit or 
vegetable juice, syrup, puree or extract. 

The invention provides papers or paper products or paper pulp comprising a
xylanase of the invention, or a polypeptide encoded by a nucleic acid of the invention. The
invention provides methods for treating a paper or a paper or wood pulp comprising the 
following steps: (a) providing a polypeptide of the invention having a xylanase activity, or a 
xylanase encoded by a nucleic acid of the invention; (b) providing a composition comprising 
a paper or a paper or wood pulp; and (c) contacting the polypeptide of step (a) and the
composition of step (b) under conditions wherein the xylanase can treat the paper or paper or
wood pulp. In one aspect, the pharmaceutical composition acts as a digestive aid or an
anti-microbial (e.g., against Salmonella). In one aspect, the treatment is prophylactic. In one
aspect, the invention provides oral care products comprising a polypeptide of the invention
having a xylanase activity, or a xylanase encoded by a nucleic acid of the invention. The oral
care product can comprise a toothpaste, a dental cream, a gel or a tooth powder, an odontic, a
mouth wash, a pre- or post brushing rinse formulation, a chewing gum, a lozenge or a candy. 
The invention provides contact lens cleaning compositions comprising a polypeptide of the 
invention having a xylanase activity, or a xylanase encoded by a nucleic acid of the 
invention. 

In one aspect, the invention provides methods for eliminating or protecting
animals from a microorganism comprising a xylan comprising administering a polypeptide of
the invention. The microorganism can be a bacterium comprising a xylan, e.g., Salmonella. 

The invention provides an isolated nucleic acid having a sequence as set forth
in SEQ ID NO:189 and variants thereof having at least 50% sequence identity to SEQ ID
NO:189 and encoding polypeptides having xylanase activity. In one aspect, the polypeptide
has a xylanase activity, e.g., a thermostable xylanase activity. 
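A sequence identity threshold such as the 50% recited above can be checked with a calculation of the following kind, sketched in Python. The sketch assumes the two sequences have already been aligned, with gaps written as "-", and it defines identity as identical columns divided by aligned columns; other definitions are in use, and the sequence comparison algorithm actually applied may differ. The function name and the example sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity of two pre-aligned sequences (gaps written as '-').

    Columns in which both sequences have a gap are ignored; a column with
    a gap in only one sequence counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)

# Hypothetical aligned sequences differing at three of ten columns.
identity = percent_identity("ATGGATCAGA", "ATGTTCCAGA")
meets_threshold = identity >= 50.0
```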

The invention provides isolated or recombinant nucleic acids comprising SEQ 
ID NO: 189, wherein SEQ ID NO: 189 comprises one or more or all of the following sequence 
variations: the nucleotides at positions 22 to 24 are TTC, the nucleotides at positions 22 to 24 
are TTT, the nucleotides at positions 31 to 33 are CAC, the nucleotides at positions 31 to 33
are CAT, the nucleotides at positions 34 to 36 are TTG, the nucleotides at positions 34 to 36 
are TTA, the nucleotides at positions 34 to 36 are CTC, the nucleotides at positions 34 to 36 
are CTT, the nucleotides at positions 34 to 36 are CTA, the nucleotides at positions 34 to 36 
are CTG, the nucleotides at positions 49 to 51 are ATA, the nucleotides at positions 49 to 51
are ATT, the nucleotides at positions 49 to 51 are ATC, the nucleotides at positions 178 to 
180 are CAC, the nucleotides at positions 178 to 180 are CAT, the nucleotides at positions 
190 to 192 are TGT, the nucleotides at positions 190 to 192 are TGC, the nucleotides at 
positions 190 to 192 are GTA, the nucleotides at positions 190 to 192 are GTT, the
nucleotides at positions 190 to 192 are GTC, the nucleotides at positions 190 to 192 are GTG,
the nucleotides at positions 193 to 195 are GTG, the nucleotides at positions 193 to 195 are 
GTC, the nucleotides at positions 193 to 195 are GTA, the nucleotides at positions 193 to 195 
are GTT, the nucleotides at positions 202 to 204 are ATA, the nucleotides at positions 202 to 
204 are ATT, the nucleotides at positions 202 to 204 are ATC, the nucleotides at positions 

1 5 202 to 204 are GCT, the nucleotides at positions 202 to 204 are GCG, the nucleotides at 
positions 202 to 204 are GCC, the nucleotides at positions 202 to 204 are GCA, the 
nucleotides at positions 235 to 237 are CCA, the nucleotides at positions 235 to 237 are CCC, 
or the nucleotides at positions 235 to 237 are CCG. 
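
The codon positions recited above relate to residue positions in the encoded polypeptide by simple arithmetic: nucleotides 22 to 24 form codon 8, nucleotides 235 to 237 form codon 79, and so on. The following Python sketch is illustrative only and is not part of the disclosed methods. It assumes the standard genetic code and a reading frame beginning at nucleotide 1; the names `residue_position` and `describe_variation` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def residue_position(first_nt):
    """Return the 1-based residue number for a codon starting at first_nt.

    Assumes the reading frame starts at nucleotide 1 (an assumption of
    this sketch, not a statement taken from the sequence listing).
    """
    if first_nt < 1 or (first_nt - 1) % 3 != 0:
        raise ValueError("first_nt must be the first base of an in-frame codon")
    return (first_nt - 1) // 3 + 1


def describe_variation(first_nt, codon):
    """Summarize a codon variation as '<one-letter amino acid><position>'."""
    return f"{CODON_TABLE[codon.upper()]}{residue_position(first_nt)}"


if __name__ == "__main__":
    for start, codon in [(22, "TTC"), (31, "CAC"), (190, "TGT"), (235, "CCG")]:
        print(start, codon, describe_variation(start, codon))
```

Under these assumptions, each recited codon can be checked against the corresponding amino acid variation listed for the encoded polypeptide.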

The invention provides isolated or recombinant polypeptides comprising an
amino acid sequence comprising SEQ ID NO:190, wherein SEQ ID NO:190 comprises one
or more or all of the following sequence variations: the aspartic acid at amino acid position 8
is phenylalanine, the glutamine at amino acid position 11 is histidine, the asparagine at amino
acid position 12 is leucine, the glycine at amino acid position 17 is isoleucine, the threonine
at amino acid position 23 is threonine encoded by a codon other than the wild type codon, the
glycine at amino acid position 60 is histidine, the proline at amino acid position 64 is
cysteine, the proline at amino acid position 64 is valine, the serine at amino acid position 65
is valine, the glycine at amino acid position 68 is isoleucine, the glycine at amino acid
position 68 is alanine, or the serine at amino acid position 79 is proline. In one aspect, the
polypeptide has a xylanase activity, e.g., a thermostable xylanase activity.

The invention provides isolated or recombinant nucleic acids comprising SEQ
ID NO:189, wherein SEQ ID NO:189 comprises one or more or all sequence variations set
forth in Table 1 or Table 2. The invention provides isolated or recombinant polypeptides
encoded by nucleic acids comprising SEQ ID NO:189, wherein SEQ ID NO:189 comprises
one or more or all sequence variations set forth in Table 1 or Table 2. In one aspect, the
polypeptide has a xylanase activity, e.g., a thermostable xylanase activity.

The invention provides isolated or recombinant nucleic acids comprising SEQ
ID NO:379, wherein SEQ ID NO:379 comprises one or more or all of the following sequence
variations: the nucleotides at positions 22 to 24 are TTC, the nucleotides at positions 31 to 33
are CAC, the nucleotides at positions 49 to 51 are ATA, the nucleotides at positions 178 to
180 are CAC, the nucleotides at positions 193 to 195 are GTG, and/or the nucleotides at
positions 202 to 204 are GCT.

The invention provides isolated or recombinant polypeptides comprising SEQ
ID NO:380, wherein SEQ ID NO:380 comprises one or more or all of the following sequence
variations: D8F, Q11H, G17I, G60H, S65V and/or G68A. In one aspect, the polypeptide has
a xylanase activity, e.g., a thermostable xylanase activity.
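
The shorthand used above (e.g., D8F) denotes the wild-type residue, its 1-based position and the replacement residue. As a minimal sketch of how such labels might be applied to a sequence in software, the following Python example parses the notation and checks the wild-type residue before substituting. The function name `apply_mutations` and the 20-residue `TOY_SEQUENCE` are hypothetical placeholders; the toy sequence is not SEQ ID NO:380.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(sequence, mutations):
    """Apply point mutations written in 'D8F' style to an amino acid sequence.

    Each label is checked against the original sequence so that a label
    naming the wrong wild-type residue is rejected rather than applied.
    """
    residues = list(sequence)
    for label in mutations:
        match = MUTATION_PATTERN.match(label)
        if match is None:
            raise ValueError(f"unrecognized mutation label: {label!r}")
        wild_type, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3))
        if not 1 <= position <= len(sequence):
            raise ValueError(f"position out of range in {label!r}")
        if sequence[position - 1] != wild_type:
            raise ValueError(
                f"{label!r} expects {wild_type} at position {position}, "
                f"found {sequence[position - 1]}")
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy sequence with D at 8, Q at 11 and G at 17 (illustration only).
TOY_SEQUENCE = "AAAAAAADAAQAAAAAGAAA"

if __name__ == "__main__":
    print(apply_mutations(TOY_SEQUENCE, ["D8F", "Q11H", "G17I"]))
```

Validating the wild-type residue is a design choice that guards against off-by-one numbering errors when labels are transcribed by hand.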

The isolated or recombinant nucleic acids of the invention are also referred to
as "Group A nucleic acid sequences". The invention provides an isolated nucleic acid
including at least 10 consecutive bases of a sequence as set forth in Group A nucleic acid
sequences, sequences substantially identical thereto and the sequences complementary
thereto.

The isolated or recombinant polypeptides of the invention, which include
functional fragments of the exemplary sequences of the invention, are also referred to as
"Group B amino acid sequences". Another aspect of the invention is an isolated or
recombinant nucleic acid encoding a polypeptide having at least 10 consecutive amino acids
of a sequence as set forth in Group B amino acid sequences and sequences substantially
identical thereto. In yet another aspect, the invention provides a purified polypeptide having
a sequence as set forth in Group B amino acid sequences and sequences substantially
identical thereto. Another aspect of the invention is an isolated or purified antibody that
specifically binds to a polypeptide having a sequence as set forth in Group B amino acid
sequences and sequences substantially identical thereto.

Another aspect of the invention is an isolated or purified antibody or binding
fragment thereof, which specifically binds to a polypeptide having at least 10 consecutive
amino acids of one of the polypeptides of Group B amino acid sequences and sequences
substantially identical thereto.

Another aspect of the invention is a method of making a polypeptide having a
sequence as set forth in Group B amino acid sequences and sequences substantially identical
thereto. The method includes introducing a nucleic acid encoding the polypeptide into a host
cell, wherein the nucleic acid is operably linked to a promoter, and culturing the host cell
under conditions that allow expression of the nucleic acid. Another aspect of the invention is
a method of making a polypeptide having at least 10 amino acids of a sequence as set forth in
Group B amino acid sequences and sequences substantially identical thereto. The method
includes introducing a nucleic acid encoding the polypeptide into a host cell, wherein the
nucleic acid is operably linked to a promoter, and culturing the host cell under conditions that
allow expression of the nucleic acid, thereby producing the polypeptide.

Another aspect of the invention is a method of generating a variant including
obtaining a nucleic acid having a sequence as set forth in Group A nucleic acid sequences,
sequences substantially identical thereto, sequences complementary to the sequences of
Group A nucleic acid sequences, or fragments comprising at least 30 consecutive nucleotides
of the foregoing sequences, and changing one or more nucleotides in the sequence to another
nucleotide, deleting one or more nucleotides in the sequence, or adding one or more
nucleotides to the sequence.
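
The three kinds of change recited above (changing, deleting or adding nucleotides) can be pictured as elementary string operations. The Python sketch below is an in silico illustration only and does not represent any laboratory mutagenesis procedure; the helper names `substitute`, `delete` and `insert` and the 1-based position convention are assumptions made for this example.

```python
def substitute(sequence, position, new_base):
    """Change the nucleotide at a 1-based position to another nucleotide."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position out of range")
    return sequence[:position - 1] + new_base + sequence[position:]


def delete(sequence, position, length=1):
    """Delete `length` nucleotides starting at a 1-based position."""
    if length < 1 or not 1 <= position <= len(sequence) - length + 1:
        raise ValueError("deletion out of range")
    return sequence[:position - 1] + sequence[position - 1 + length:]


def insert(sequence, position, bases):
    """Add nucleotides after a 1-based position (0 inserts at the start)."""
    if not 0 <= position <= len(sequence):
        raise ValueError("position out of range")
    return sequence[:position] + bases + sequence[position:]


if __name__ == "__main__":
    parent = "ATGC"
    print(substitute(parent, 2, "G"))  # one nucleotide changed
    print(delete(parent, 2, 2))        # two nucleotides deleted
    print(insert(parent, 2, "TT"))     # two nucleotides added
```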

Another aspect of the invention is a computer readable medium having stored
thereon a sequence as set forth in Group A nucleic acid sequences and sequences
substantially identical thereto, or a polypeptide sequence as set forth in Group B amino acid
sequences and sequences substantially identical thereto.

Another aspect of the invention is a computer system including a processor
and a data storage device wherein the data storage device has stored thereon a sequence as set
forth in Group A nucleic acid sequences and sequences substantially identical thereto, or a
polypeptide having a sequence as set forth in Group B amino acid sequences and sequences
substantially identical thereto.

Another aspect of the invention is a method for comparing a first sequence to a
reference sequence wherein the first sequence is a nucleic acid having a sequence as set forth
in Group A nucleic acid sequences and sequences substantially identical thereto, or a
polypeptide code of Group B amino acid sequences and sequences substantially identical
thereto. The method includes reading the first sequence and the reference sequence through
use of a computer program that compares sequences; and determining differences between
the first sequence and the reference sequence with the computer program.
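
By way of illustration only, a comparison of the kind described above might be sketched in Python as follows. The sketch assumes the two sequences are already aligned and of equal length, which a practical sequence comparison program would not require; the function name `compare_sequences` is hypothetical and does not correspond to any particular program.

```python
def compare_sequences(first, reference):
    """Report differences and percent identity between two aligned sequences.

    Returns a list of (1-based position, reference symbol, first symbol)
    tuples and the percent identity over the full length. Both inputs must
    be the same, non-zero length; no alignment is performed here.
    """
    if not reference or len(first) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    differences = [
        (index + 1, ref_symbol, first_symbol)
        for index, (first_symbol, ref_symbol) in enumerate(zip(first, reference))
        if first_symbol != ref_symbol
    ]
    identity = 100.0 * (len(reference) - len(differences)) / len(reference)
    return differences, identity


if __name__ == "__main__":
    diffs, identity = compare_sequences("ATGCAT", "ATGGAT")
    print(diffs)
    print(f"{identity:.1f}% identity")
```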

Another aspect of the invention is a method for identifying a feature in a
sequence as set forth in Group A nucleic acid sequences and sequences substantially identical
thereto, or a polypeptide having a sequence as set forth in Group B amino acid sequences and
sequences substantially identical thereto, including reading the sequence through the use of a
computer program which identifies features in sequences; and identifying features in the
sequence with the computer program.
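
One simple example of a sequence feature is an open reading frame. The Python sketch below, offered as an assumption-laden illustration rather than as the identifier process of the invention, scans the forward strand only for an ATG start codon followed by the first in-frame stop codon and reports non-overlapping matches. The name `find_orfs` is hypothetical.

```python
import re

# ATG, then as few in-frame codons as possible, then the first stop codon.
ORF_PATTERN = re.compile(r"ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)")


def find_orfs(sequence):
    """Return (start, end, orf) tuples for forward-strand ORFs.

    Positions are 1-based and inclusive. Matches do not overlap, and the
    reverse strand is not examined in this simplified sketch.
    """
    return [
        (match.start() + 1, match.end(), match.group(0))
        for match in ORF_PATTERN.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    for start, end, orf in find_orfs("ccatgaaataggg"):
        print(start, end, orf)
```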

Yet another aspect of the invention is a method of catalyzing the breakdown of
xylan or a derivative thereof, comprising the step of contacting a sample containing xylan or
the derivative thereof with a polypeptide of Group B amino acid sequences and sequences
substantially identical thereto under conditions which facilitate the breakdown of the xylan.

Another aspect of the invention is an assay for identifying fragments or
variants of Group B amino acid sequences and sequences substantially identical thereto,
which retain the enzymatic function of the polypeptides of Group B amino acid sequences
and sequences substantially identical thereto. The assay includes contacting the polypeptide
of Group B amino acid sequences, sequences substantially identical thereto, or polypeptide
fragment or variant with a substrate molecule under conditions which allow the polypeptide
fragment or variant to function, and detecting either a decrease in the level of substrate or an
increase in the level of the specific reaction product of the reaction between the polypeptide
and substrate, thereby identifying a fragment or variant of such sequences.

Another aspect of the invention is a nucleic acid probe of an oligonucleotide
from about 10 to 50 nucleotides in length and having a segment of at least 10 contiguous
nucleotides that is at least 50% complementary to a nucleic acid target region of a nucleic
acid sequence selected from the group consisting of Group A nucleic acid sequences; and
which hybridizes to the nucleic acid target region under moderate to highly stringent
conditions to form a detectable target:probe duplex.
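
The length and complementarity limits recited for the probe can be checked computationally. The Python sketch below is a simplified illustration under stated assumptions: it treats the whole probe as the complementary segment, compares it base by base against the reverse complement of an equal-length target region, and does not model hybridization or stringency. The function names are hypothetical.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def percent_complementary(probe, target_region):
    """Percent of probe bases that pair with an equal-length target region."""
    if not probe or len(probe) != len(target_region):
        raise ValueError("probe and target region must be non-empty, equal length")
    expected = reverse_complement(target_region)
    matches = sum(p == e for p, e in zip(probe.upper(), expected))
    return 100.0 * matches / len(probe)


def meets_probe_criteria(probe, target_region, min_percent=50.0):
    """Check the 10 to 50 nucleotide length range and the complementarity floor."""
    return (10 <= len(probe) <= 50
            and percent_complementary(probe, target_region) >= min_percent)


if __name__ == "__main__":
    target = "AAAACCCCGG"
    print(percent_complementary(reverse_complement(target), target))
    print(meets_probe_criteria("CCGGGAAAAA", target))
```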

Another aspect of the invention is a polynucleotide probe for isolation or
identification of xylanase genes having a sequence which is the same as, or fully
complementary to, at least a fragment of one of Group A nucleic acid sequences.

In still another aspect, the invention provides a protein preparation comprising
a polypeptide having an amino acid sequence selected from Group B amino acid sequences
and sequences substantially identical thereto wherein the protein preparation is a liquid.

Still another aspect of the invention provides a protein preparation comprising
a polypeptide having an amino acid sequence selected from Group B amino acid sequences
and sequences substantially identical thereto wherein the polypeptide is a solid.

Yet another aspect of the invention provides a method for modifying small
molecules, comprising the step of mixing at least one polypeptide encoded by a
polynucleotide selected from Group A nucleic acid sequences, sequences substantially
identical thereto and the sequences complementary thereto with at least one small molecule,
to produce at least one modified small molecule via at least one biocatalytic reaction, where
the at least one polypeptide has xylanase activity.

Another aspect of the invention is a cloning vector of a sequence that encodes
a polypeptide having xylanase activity, said sequence being selected from Group A nucleic
acid sequences, sequences substantially identical thereto and the sequences complementary
thereto.

Another aspect of the invention is a host cell comprising a sequence that
encodes a polypeptide having xylanase activity, said sequence being selected from Group A
nucleic acid sequences, sequences substantially identical thereto and the sequences
complementary thereto.

In yet another aspect, the invention provides an expression vector capable of
replicating in a host cell comprising a polynucleotide having a sequence selected from Group A
nucleic acid sequences, sequences substantially identical thereto, sequences complementary
thereto and isolated nucleic acids that hybridize to nucleic acids having any of the foregoing
sequences under conditions of low, moderate and high stringency.

In another aspect, the invention provides a method of dough conditioning 
comprising contacting dough with at least one polypeptide of Group B amino acid sequences 
and sequences substantially identical thereto under conditions sufficient for conditioning the 
dough. 

Another aspect of the invention is a method of beverage production
comprising administration of at least one polypeptide of Group B amino acid sequences and
sequences substantially identical thereto under conditions sufficient for decreasing the
viscosity of wort or beer.

The xylanases of the invention are used to break down the high molecular
weight arabinoxylans in animal feed. Adding the xylanases of the invention stimulates
growth rates by improving digestibility, which also improves the quality of the animal litter.
Xylanase functions through the gastro-intestinal tract to reduce intestinal viscosity and
increase diffusion of pancreatic enzymes. Additionally, the xylanases of the invention may
be used in the treatment of endosperm cell walls of feed grains and vegetable proteins. In one
aspect of the invention, the novel xylanases of the invention are administered to an animal in
order to increase the utilization of the xylan in the food. This activity of the xylanases of the
invention may be used to break down insoluble cell wall material, liberating nutrients in the
cell walls, which then become available to the animal. It also changes hemicellulose to
nutritive sugars so that nutrients formerly trapped within the cell walls are released.
Xylanase also produces compounds that may be a nutritive source for the ruminal microflora.

Another aspect of the invention provides a method for utilizing xylanase as a
nutritional supplement in the diets of animals, comprising preparation of a nutritional
supplement containing a recombinant xylanase enzyme comprising at least thirty contiguous
amino acids of Group B amino acid sequences and sequences substantially identical thereto
and administering the nutritional supplement to an animal to increase the utilization of xylan
contained in food ingested by the animal.

In another aspect of the invention, a method for delivering a xylanase
supplement to an animal is provided, where the method comprises preparing an edible
enzyme delivery matrix in the form of pellets comprising a granulate edible carrier and a
thermostable recombinant xylanase enzyme, wherein the particles readily disperse the
xylanase enzyme contained therein into aqueous media, and administering the edible enzyme
delivery matrix to the animal. The granulate edible carrier may comprise a carrier selected
from the group consisting of grain germ that is spent of oil, hay, alfalfa, timothy, soy hull,
sunflower seed meal and wheat midd. The xylanase enzyme may have an amino acid
sequence as set forth in Group B amino acid sequences and sequences substantially identical
thereto.

In another aspect, the invention provides an isolated nucleic acid comprising a
sequence that encodes a polypeptide having xylanase activity, wherein the sequence is
selected from Group A nucleic acid sequences, sequences substantially identical thereto and
the sequences complementary thereto, wherein the sequence contains a signal sequence. The
invention also provides an isolated nucleic acid comprising a sequence that encodes a
polypeptide having xylanase activity, wherein the sequence is selected from Group A nucleic
acid sequences, sequences substantially identical thereto and the sequences complementary
thereto, wherein the sequence contains a signal sequence from another xylanase.
Additionally, the invention provides an isolated nucleic acid comprising a sequence that
encodes a polypeptide having xylanase activity, wherein the sequence is selected from Group
A nucleic acid sequences, sequences substantially identical thereto and the sequences
complementary thereto, wherein the sequence does not contain a signal sequence.

Still another aspect of the invention provides an isolated nucleic acid that is a
mutation of SEQ ID NO:189. Yet another aspect provides an amino acid sequence that is a
mutation of SEQ ID NO:190.

The details of one or more embodiments of the invention are set forth in the
accompanying drawings and the description below. Other features, objects, and advantages
of the invention will be apparent from the description and drawings, and from the claims.

All publications, patents, patent applications, GenBank sequences and ATCC
deposits cited herein are hereby expressly incorporated by reference for all purposes.

BRIEF DESCRIPTION OF THE DRAWINGS 
The following drawings are illustrative of aspects of the invention and are not 
meant to limit the scope of the invention as encompassed by the claims. 

The patent or application file contains at least one drawing executed in color.
Copies of this patent or patent application publication with color drawing(s) will be provided
by the Office upon request and payment of the necessary fee.

Figure 1 is a block diagram of a computer system. 

Figure 2 is a flow diagram illustrating one aspect of a process for comparing a
new nucleotide or protein sequence with a database of sequences in order to determine the
homology levels between the new sequence and the sequences in the database.

Figure 3 is a flow diagram illustrating one aspect of a process in a computer 
for determining whether two sequences are homologous. 

Figure 4 is a flow diagram illustrating one aspect of an identifier process 300
for detecting the presence of a feature in a sequence.

Figure 5 is a graph comparing activity of the wild type sequence (SEQ ID
NOS:189 and 190) to the 8x mutant (SEQ ID NOS:375, 376), a combination of mutants D,
F, H, I, S, V, X and AA in Table 1.

Figure 6A illustrates the nine single site amino acid mutants of SEQ ID
NO:378 (encoded by SEQ ID NO:377) as generated by Gene Site Saturation Mutagenesis
(GSSM™) of SEQ ID NO:190 (encoded by SEQ ID NO:189), as described in detail in
Example 5, below.

Figure 6B illustrates the unfolding of SEQ ID NO:190 and SEQ ID NO:378 in
melting temperature transition midpoint (Tm) experiments as determined by DSC for each
enzyme, as described in detail in Example 5, below.

Figure 7A illustrates the pH and temperature activity profiles for the enzymes
SEQ ID NO:190 and SEQ ID NO:378, as described in detail in Example 5, below.

Figure 7B illustrates the rate/temperature activity optima for the enzymes SEQ
ID NO:190 and SEQ ID NO:378, as described in detail in Example 5, below.

Figure 7C illustrates the thermal tolerance/residual activity for the enzymes
SEQ ID NO:190 and SEQ ID NO:378, as described in detail in Example 5, below.

Figure 8A illustrates the GeneReassembly™ library of all possible
combinations of the 9 GSSM™ point mutations that was constructed and screened for
variants with improved thermal tolerance and activity, as described in detail in Example 5,
below.

Figure 8B illustrates the relative activity of the "6X-2" variant and "9X"
variant (SEQ ID NO:378) compared to SEQ ID NO:190 ("wild-type") at a temperature
optimum and pH 6.0, as described in detail in Example 5, below.

Figure 9A illustrates the fingerprints obtained after hydrolysis of oligoxylans
(Xyl)3, (Xyl)4, (Xyl)5 and (Xyl)6 by the SEQ ID NO:190 ("wild-type") and the "9X" variant
(SEQ ID NO:378) enzymes, as described in detail in Example 5, below.

Figure 9B illustrates the fingerprints obtained after hydrolysis of Beechwood
xylan by the SEQ ID NO:190 ("wild-type") and the "9X" variant (SEQ ID NO:378)
enzymes, as described in detail in Example 5, below.

Figure 10A is a schematic diagram illustrating the level of thermal stability
(represented by Tm) improvement over SEQ ID NO:190 ("wild-type") obtained by GSSM™
evolution, as described in detail in Example 5, below.

Figure 10B illustrates a "fitness diagram" of enzyme improvement in the form
of SEQ ID NO:378 and SEQ ID NO:380, as obtained by combining GSSM™ and
GeneReassembly™ technologies, as described in detail in Example 5, below.

Figure 11 is a schematic flow diagram of an exemplary routine screening
protocol to determine whether a xylanase of the invention is useful in pretreating paper pulp,
as described in detail in Example 6, below.

Like reference symbols in the various drawings indicate like elements.

DETAILED DESCRIPTION OF THE INVENTION 
The present invention relates to xylanases and polynucleotides encoding them
and methods of making and using them. Xylanase activity of the polypeptides of the
invention encompasses enzymes having hydrolase activity, for example, enzymes capable of
hydrolyzing glycosidic linkages present in xylan, e.g., catalyzing hydrolysis of internal
β-1,4-xylosidic linkages. The xylanases of the invention can be used to make and/or process
foods, feeds, nutritional supplements, textiles, detergents and the like. The xylanases of the
invention can be used in pharmaceutical compositions and dietary aids. Xylanases of the
invention are particularly useful in baking, animal feed, beverage and paper processes.

Definitions

The term "antibody" includes a peptide or polypeptide derived from, modeled
after or substantially encoded by an immunoglobulin gene or immunoglobulin genes, or
fragments thereof, capable of specifically binding an antigen or epitope, see, e.g.,
Fundamental Immunology, Third Edition, W.E. Paul, ed., Raven Press, N.Y. (1993); Wilson
(1994) J. Immunol. Methods 175:267-273; Yarmush (1992) J. Biochem. Biophys. Methods
25:85-97. The term antibody includes antigen-binding portions, i.e., "antigen binding sites,"
(e.g., fragments, subsequences, complementarity determining regions (CDRs)) that retain
capacity to bind antigen, including (i) a Fab fragment, a monovalent fragment consisting of
the VL, VH, CL and CH1 domains; (ii) a F(ab')2 fragment, a bivalent fragment comprising
two Fab fragments linked by a disulfide bridge at the hinge region; (iii) a Fd fragment
consisting of the VH and CH1 domains; (iv) a Fv fragment consisting of the VL and VH
domains of a single arm of an antibody; (v) a dAb fragment (Ward et al., (1989) Nature
341:544-546), which consists of a VH domain; and (vi) an isolated complementarity
determining region (CDR). Single chain antibodies are also included by reference in the term
"antibody."

The terms "array" or "microarray" or "biochip" or "chip" as used herein refer to a
plurality of target elements, each target element comprising a defined amount of one or more
polypeptides (including antibodies) or nucleic acids immobilized onto a defined area of a
substrate surface, as discussed in further detail, below.

As used herein, the terms "computer," "computer program" and "processor"
are used in their broadest general contexts and incorporate all such devices, as described in
detail, below. A "coding sequence of" or a "sequence encoding" a particular polypeptide or
protein is a nucleic acid sequence which is transcribed and translated into a polypeptide or
protein when placed under the control of appropriate regulatory sequences.

The phrases "nucleic acid" or "nucleic acid sequence" as used herein refer to
an oligonucleotide, nucleotide, polynucleotide, or to a fragment of any of these, to DNA or
RNA of genomic or synthetic origin which may be single-stranded or double-stranded and
may represent a sense or antisense strand, to peptide nucleic acid (PNA), or to any DNA-like
or RNA-like material, natural or synthetic in origin. The phrases "nucleic acid" or "nucleic
acid sequence" include an oligonucleotide, nucleotide, polynucleotide, or a fragment of any
of these, DNA or RNA (e.g., mRNA, rRNA, tRNA, iRNA) of genomic or synthetic origin
which may be single-stranded or double-stranded and may represent a sense or antisense
strand, peptide nucleic acid (PNA), or any DNA-like or RNA-like material, natural or
synthetic in origin, including, e.g., iRNA, ribonucleoproteins (e.g., double stranded
iRNAs, e.g., iRNPs). The term encompasses nucleic acids, i.e., oligonucleotides, containing
known analogues of natural nucleotides. The term also encompasses nucleic-acid-like
structures with synthetic backbones, see e.g., Mata (1997) Toxicol. Appl. Pharmacol.
144:189-197; Strauss-Soukup (1997) Biochemistry 36:8692-8698; Samstag (1996) Antisense
Nucleic Acid Drug Dev 6:153-156. "Oligonucleotide" includes either a single stranded
polydeoxynucleotide or two complementary polydeoxynucleotide strands that may be
chemically synthesized. Such synthetic oligonucleotides have no 5' phosphate and thus will
not ligate to another oligonucleotide without adding a phosphate with an ATP in the presence
of a kinase. A synthetic oligonucleotide can ligate to a fragment that has not been
dephosphorylated.

A "coding sequence of" or a "nucleotide sequence encoding" a particular
polypeptide or protein is a nucleic acid sequence which is transcribed and translated into a
polypeptide or protein when placed under the control of appropriate regulatory sequences.

The term "gene" means the segment of DNA involved in producing a
polypeptide chain; it includes regions preceding and following the coding region (leader and
trailer) as well as, where applicable, intervening sequences (introns) between individual
coding segments (exons). "Operably linked" as used herein refers to a functional relationship
between two or more nucleic acid (e.g., DNA) segments. Typically, it refers to the functional
relationship of transcriptional regulatory sequence to a transcribed sequence. For example, a
promoter is operably linked to a coding sequence, such as a nucleic acid of the invention, if it
stimulates or modulates the transcription of the coding sequence in an appropriate host cell or
other expression system. Generally, promoter transcriptional regulatory sequences that are
operably linked to a transcribed sequence are physically contiguous to the transcribed
sequence, i.e., they are cis-acting. However, some transcriptional regulatory sequences, such
as enhancers, need not be physically contiguous or located in close proximity to the coding
sequences whose transcription they enhance.

The term "expression cassette" as used herein refers to a nucleotide sequence
which is capable of effecting expression of a structural gene (i.e., a protein coding sequence,
such as a xylanase of the invention) in a host compatible with such sequences. Expression
cassettes include at least a promoter operably linked with the polypeptide coding sequence;
and, optionally, with other sequences, e.g., transcription termination signals. Additional
factors necessary or helpful in effecting expression may also be used, e.g., enhancers. Thus,
expression cassettes also include plasmids, expression vectors, recombinant viruses, any form
of recombinant "naked DNA" vector, and the like. A "vector" comprises a nucleic acid that
can infect, transfect, transiently or permanently transduce a cell. It will be recognized that a
vector can be a naked nucleic acid, or a nucleic acid complexed with protein or lipid. The
vector optionally comprises viral or bacterial nucleic acids and/or proteins, and/or membranes
(e.g., a cell membrane, a viral lipid envelope, etc.). Vectors include, but are not limited to,
replicons (e.g., RNA replicons, bacteriophages) to which fragments of DNA may be attached
and become replicated. Vectors thus include, but are not limited to, RNA, autonomous
self-replicating circular or linear DNA or RNA (e.g., plasmids, viruses, and the like, see, e.g.,
U.S. Patent No. 5,217,879), and include both the expression and non-expression plasmids.
Where a recombinant microorganism or cell culture is described as hosting an "expression
vector" this includes both extra-chromosomal circular and linear DNA and DNA that has
been incorporated into the host chromosome(s). Where a vector is being maintained by a
host cell, the vector may either be stably replicated by the cells during mitosis as an
autonomous structure, or be incorporated within the host's genome.

As used herein, the term "promoter" includes all sequences capable of driving
transcription of a coding sequence in a cell, e.g., a plant cell. Thus, promoters used in the
constructs of the invention include cis-acting transcriptional control elements and regulatory
sequences that are involved in regulating or modulating the timing and/or rate of transcription
of a gene. For example, a promoter can be a cis-acting transcriptional control element,
including an enhancer, a promoter, a transcription terminator, an origin of replication, a
chromosomal integration sequence, 5' and 3' untranslated regions, or an intronic sequence,
which are involved in transcriptional regulation. These cis-acting sequences typically interact
with proteins or other biomolecules to carry out (turn on/off, regulate, modulate, etc.)
transcription. "Constitutive" promoters are those that drive expression continuously under
most environmental conditions and states of development or cell differentiation. "Inducible"
or "regulatable" promoters direct expression of the nucleic acid of the invention under the
influence of environmental conditions or developmental conditions. Examples of
environmental conditions that may affect transcription by inducible promoters include
anaerobic conditions, elevated temperature, drought, or the presence of light.

"Tissue-specific" promoters are transcriptional control elements that are only 
active in particular cells or tissues or organs, e.g., in plants or animals. Tissue-specific 
regulation may be achieved by certain intrinsic factors that ensure that genes encoding



WO 03/106654 PCT/US03/19153 

proteins specific to a given tissue are expressed. Such factors are known to exist in mammals 
and plants so as to allow for specific tissues to develop. 

The term "plant" includes whole plants, plant parts (e.g., leaves, stems,
flowers, roots, etc.), plant protoplasts, seeds and plant cells and progeny of same. The class
of plants which can be used in the method of the invention is generally as broad as the class
of higher plants amenable to transformation techniques, including angiosperms
(monocotyledonous and dicotyledonous plants), as well as gymnosperms. It includes plants
of a variety of ploidy levels, including polyploid, diploid, haploid and hemizygous states. As
used herein, the term "transgenic plant" includes plants or plant cells into which a
heterologous nucleic acid sequence has been inserted, e.g., the nucleic acids and various
recombinant constructs (e.g., expression cassettes) of the invention.

"Plasmids" can be commercially available, publicly available on an 
unrestricted basis, or can be constructed from available plasmids in accord with published 
procedures. Equivalent plasmids to those described herein are known in the art and will be
apparent to the ordinarily skilled artisan.

"Amino acid" or "amino acid sequence" as used herein refer to an 
oligopeptide, peptide, polypeptide, or protein sequence, or to a fragment, portion, or subunit
of any of these and to naturally occurring or synthetic molecules. The term "polypeptide" as
used herein refers to amino acids joined to each other by peptide bonds or modified peptide
bonds, i.e., peptide isosteres, and may contain modified amino acids other than the 20
gene-encoded amino acids. The polypeptides may be modified by either natural processes,
such as post-translational processing, or by chemical modification techniques that are well
known in the art.

Modifications can occur anywhere in the polypeptide, including the peptide backbone, the
amino acid side-chains and the amino or carboxyl termini. It will be appreciated that the
same type of modification may be present in the same or varying degrees at several sites in a
given polypeptide. Also a given polypeptide may have many types of modifications.
Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent
attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a
nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent
attachment of a phosphatidylinositol, cross-linking cyclization, disulfide bond formation,
demethylation, formation of covalent cross-links, formation of cysteine, formation of
pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation,
hydroxylation, iodination, methylation, myristoylation, oxidation, pegylation, xylan
hydrolase processing, phosphorylation, prenylation, racemization, selenoylation, sulfation
and transfer-RNA mediated addition of amino acids to protein such as arginylation. (See
Creighton, T.E., Proteins - Structure and Molecular Properties 2nd Ed., W.H. Freeman and
Company, New York (1993); Posttranslational Covalent Modification of Proteins, B.C.
Johnson, Ed., Academic Press, New York, pp. 1-12 (1983)). The peptides and polypeptides
of the invention also include all "mimetic" and "peptidomimetic" forms, as described in
further detail, below.

As used herein, the term "isolated" means that the material is removed from its
original environment (e.g., the natural environment if it is naturally occurring). For example,
a naturally-occurring polynucleotide or polypeptide present in a living animal is not isolated,
but the same polynucleotide or polypeptide, separated from some or all of the coexisting
materials in the natural system, is isolated. Such polynucleotides could be part of a vector
and/or such polynucleotides or polypeptides could be part of a composition and still be
isolated in that such vector or composition is not part of its natural environment. As used
herein, the term "purified" does not require absolute purity; rather, it is intended as a relative
definition. Individual nucleic acids obtained from a library have been conventionally purified to
electrophoretic homogeneity. The sequences obtained from these clones could not be obtained
directly either from the library or from total human DNA. The purified nucleic acids of the
invention have been purified from the remainder of the genomic DNA in the organism by at
least 10⁴-10⁶ fold. However, the term "purified" also includes nucleic acids that have been
purified from the remainder of the genomic DNA or from other sequences in a library or other
environment by at least one order of magnitude, typically two or three orders and more typically
four or five orders of magnitude.

As used herein, the term "recombinant" means that the nucleic acid is adjacent to
a "backbone" nucleic acid to which it is not adjacent in its natural environment. Additionally, to
be "enriched" the nucleic acids will represent 5% or more of the number of nucleic acid inserts
in a population of nucleic acid backbone molecules. Backbone molecules according to the
invention include nucleic acids such as expression vectors, self-replicating nucleic acids, viruses,
integrating nucleic acids and other vectors or nucleic acids used to maintain or manipulate a
nucleic acid insert of interest. Typically, the enriched nucleic acids represent 15% or more of
the number of nucleic acid inserts in the population of recombinant backbone molecules. More
typically, the enriched nucleic acids represent 50% or more of the number of nucleic acid inserts
in the population of recombinant backbone molecules. In one aspect, the enriched nucleic
acids represent 90% or more of the number of nucleic acid inserts in the population of
recombinant backbone molecules.

"Recombinant" polypeptides or proteins refer to polypeptides or proteins 
produced by recombinant DNA techniques; i.e., produced from cells transformed by an
exogenous DNA construct encoding the desired polypeptide or protein. "Synthetic"
polypeptides or proteins are those prepared by chemical synthesis. Solid-phase chemical
peptide synthesis methods can also be used to synthesize the polypeptide or fragments of the
invention. Such methods have been known in the art since the early 1960s (Merrifield, R. B., J.
Am. Chem. Soc., 85:2149-2154, 1963) (see also Stewart, J. M. and Young, J. D., Solid Phase
Peptide Synthesis, 2nd Ed., Pierce Chemical Co., Rockford, Ill., pp. 11-12) and have recently
been employed in commercially available laboratory peptide design and synthesis kits
(Cambridge Research Biochemicals). Such commercially available laboratory kits have
generally utilized the teachings of H. M. Geysen et al., Proc. Natl. Acad. Sci. USA, 81:3998
(1984) and provide for synthesizing peptides upon the tips of a multitude of "rods" or "pins," all
of which are connected to a single plate. When such a system is utilized, a plate of rods or pins is
inverted and inserted into a second plate of corresponding wells or reservoirs, which contain
solutions for attaching or anchoring an appropriate amino acid to the tips of the pins or rods. By
repeating such a process step, i.e., inverting and inserting the tips of the rods and pins into
appropriate solutions, amino acids are built into desired peptides. In addition, a number of FMOC
peptide synthesis systems are available. For example, assembly of a polypeptide or fragment can
be carried out on a solid support using an Applied Biosystems, Inc. Model 431A automated
peptide synthesizer. Such equipment provides ready access to the peptides of the invention,
either by direct synthesis or by synthesis of a series of fragments that can be coupled using
other known techniques.

A promoter sequence is "operably linked to" a coding sequence when RNA 
polymerase which initiates transcription at the promoter will transcribe the coding sequence 
into mRNA. 

"Plasmids" are designated by a lower case "p" preceded and/or followed by 
capital letters and/or numbers. The starting plasmids herein are either commercially
available, publicly available on an unrestricted basis, or can be constructed from available
plasmids in accord with published procedures. In addition, equivalent plasmids to those
described herein are known in the art and will be apparent to the ordinarily skilled artisan.

"Digestion" of DNA refers to catalytic cleavage of the DNA with a restriction
enzyme that acts only at certain sequences in the DNA. The various restriction enzymes used
herein are commercially available and their reaction conditions, cofactors and other
requirements were used as would be known to the ordinarily skilled artisan. For analytical
purposes, typically 1 μg of plasmid or DNA fragment is used with about 2 units of enzyme in
about 20 μl of buffer solution. For the purpose of isolating DNA fragments for plasmid
construction, typically 5 to 50 μg of DNA are digested with 20 to 250 units of enzyme in a
larger volume. Appropriate buffers and substrate amounts for particular restriction enzymes
are specified by the manufacturer. Incubation times of about 1 hour at 37°C are ordinarily
used, but may vary in accordance with the supplier's instructions. After digestion, gel
electrophoresis may be performed to isolate the desired fragment.

The phrase "substantially identical" in the context of two nucleic acids or
polypeptides, refers to two or more sequences that have, e.g., at least about 50%, 51%, 52%,
53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%,
69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%,
85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or
more nucleotide or amino acid residue (sequence) identity, when compared and aligned for
maximum correspondence, as measured using one of the known sequence comparison
algorithms or by visual inspection. Typically, the substantial identity exists over a region of
at least about 100 residues and most commonly the sequences are substantially identical over
at least about 150-200 residues. In some aspects, the sequences are substantially identical
over the entire length of the coding regions.
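
The identity thresholds above can be illustrated with a short calculation. The following is a minimal Python sketch, not the comparison algorithm used for the invention: it assumes the two sequences have already been aligned (gaps written as "-") and it counts identity only over columns where neither sequence has a gap, which is one of several conventions in use. The function names, the 50% default and the 100-residue default are illustrative choices taken from the lowest figures in the paragraph above.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are skipped (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               threshold: float = 50.0,
                               min_residues: int = 100) -> bool:
    """True if identity meets the threshold over at least min_residues compared residues."""
    compared = sum(1 for a, b in zip(aligned_a, aligned_b)
                   if a != "-" and b != "-")
    if compared < min_residues:
        return False
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, `percent_identity("AC-GT", "ACTGA")` compares four ungapped columns, three of which match. Real comparisons would normally rely on an established alignment program rather than a hand-rolled function.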

Additionally a "substantially identical" amino acid sequence is a sequence that
differs from a reference sequence by one or more conservative or non-conservative amino
acid substitutions, deletions, or insertions, particularly when such a substitution occurs at a
site that is not the active site of the molecule and provided that the polypeptide essentially
retains its functional properties. A conservative amino acid substitution, for example,
substitutes one amino acid for another of the same class (e.g., substitution of one hydrophobic
amino acid, such as isoleucine, valine, leucine, or methionine, for another, or substitution of
one polar amino acid for another, such as substitution of arginine for lysine, glutamic acid for
aspartic acid or glutamine for asparagine). One or more amino acids can be deleted, for
example, from a xylanase polypeptide, resulting in modification of the structure of the
polypeptide, without significantly altering its biological activity. For example, amino- or
carboxyl-terminal amino acids that are not required for xylanase biological activity can be
removed. Modified polypeptide sequences of the invention can be assayed for xylanase
biological activity by any number of methods, including contacting the modified polypeptide
sequence with a xylanase substrate and determining whether the modified polypeptide
decreases the amount of specific substrate in the assay or increases the bioproducts of the
enzymatic reaction of a functional xylanase polypeptide with the substrate.
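
The notion of a conservative substitution described above can be sketched in a few lines of Python. The classes below contain only the residues named as examples in the preceding paragraph (isoleucine, valine, leucine, methionine; arginine/lysine; glutamic acid/aspartic acid; glutamine/asparagine); the class names and the function `is_conservative` are hypothetical, and a fuller classification would cover all twenty amino acids.

```python
# Residue classes limited to the examples given in the text (one-letter codes).
# The grouping names are illustrative assumptions, not definitions from the source.
SUBSTITUTION_CLASSES = {
    "hydrophobic": set("IVLM"),  # isoleucine, valine, leucine, methionine
    "basic": set("RK"),          # arginine, lysine
    "acidic": set("DE"),         # aspartic acid, glutamic acid
    "amide": set("NQ"),          # asparagine, glutamine
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues differ and fall in the same example class."""
    original = original.upper()
    replacement = replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in members and replacement in members
               for members in SUBSTITUTION_CLASSES.values())
```

Under these assumptions `is_conservative("R", "K")` is true, while a change such as aspartic acid to phenylalanine is treated as non-conservative.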

"Fragments" as used herein are a portion of a naturally occurring protein 
which can exist in at least two different conformations. Fragments can have the same or
substantially the same amino acid sequence as the naturally occurring protein. "Substantially
the same" means that an amino acid sequence is largely, but not entirely, the same, but retains
at least one functional activity of the sequence to which it is related. In general two amino
acid sequences are "substantially the same" or "substantially homologous" if they are at least
about 85% identical. Fragments which have different three-dimensional structures than the
naturally occurring protein are also included. An example of this is a "pro-form" molecule,
such as a low activity proprotein that can be modified by cleavage to produce a mature
enzyme with significantly higher activity.

"Hybridization" refers to the process by which a nucleic acid strand joins with
a complementary strand through base pairing. Hybridization reactions can be sensitive and
selective so that a particular sequence of interest can be identified even in samples in which it
is present at low concentrations. Suitably stringent conditions can be defined by, for
example, the concentrations of salt or formamide in the prehybridization and hybridization
solutions, or by the hybridization temperature and are well known in the art. In particular,
stringency can be increased by reducing the concentration of salt, increasing the
concentration of formamide, or raising the hybridization temperature. In alternative aspects,
nucleic acids of the invention are defined by their ability to hybridize under various
stringency conditions (e.g., high, medium, and low), as set forth herein.

For example, hybridization under high stringency conditions could occur in
about 50% formamide at about 37°C to 42°C. Hybridization could occur under reduced
stringency conditions in about 35% to 25% formamide at about 30°C to 35°C. In particular,
hybridization could occur under high stringency conditions at 42°C in 50% formamide, 5X
SSPE, 0.3% SDS and 200 n/ml sheared and denatured salmon sperm DNA. Hybridization
could occur under reduced stringency conditions as described above, but in 35% formamide
at a reduced temperature of 35°C. The temperature range corresponding to a particular level
of stringency can be further narrowed by calculating the purine to pyrimidine ratio of the
nucleic acid of interest and adjusting the temperature accordingly. Variations on the above
ranges and conditions are well known in the art.

The term "variant" refers to polynucleotides or polypeptides of the invention
modified at one or more base pairs, codons, introns, exons, or amino acid residues
(respectively) yet still retain the biological activity of a xylanase of the invention. Variants
can be produced by any number of means, including methods such as, for example, error-prone
PCR, shuffling, oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR
mutagenesis, in vivo mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis,
exponential ensemble mutagenesis, site-specific mutagenesis, gene reassembly (e.g.,
GeneReassembly™, see, e.g., U.S. Patent No. 6,537,776), GSSM™ and any combination
thereof.

Table 1 and Table 2 list variants obtained by mutating SEQ ID NO:189
(encoding SEQ ID NO:190) by GSSM™. The invention provides nucleic acids having one
or more, or all, of the sequences as set forth in Tables 1 and 2, i.e., nucleic acids having
sequences that are variants of SEQ ID NO:189, where the variations are set forth in Table 1
and Table 2, and the polypeptides that are encoded by these variants.
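
The variant labels in Tables 1 and 2 pair a wild-type codon with a GSSM™ codon and name the change as wild-type residue, position, variant residue (for example, D8F for GAC to TTC). The following Python sketch shows how such a label can be checked against its codons. It assumes the standard genetic code; the function name `check_mutation` is hypothetical and the residue position is parsed but not verified, since that would require the full parent sequence.

```python
import re

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def check_mutation(label: str, wild_codon: str, variant_codon: str) -> bool:
    """True if the codons translate to the residues named in a label such as 'D8F'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    wild_residue, _position, variant_residue = match.groups()
    return (CODON_TABLE[wild_codon.upper()] == wild_residue
            and CODON_TABLE[variant_codon.upper()] == variant_residue)
```

A silent change such as T23T (ACC to ACG) also passes this check, because both codons encode threonine.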

These GSSM™ variants (set forth in Tables 1 and 2) were tested for thermal
tolerance (see Examples, below). Mutants D, F, G, H, I, J, K, S, T, U, V, W, X, Y, Z, AA,
DD and EE were found to have the highest thermal tolerance among the mutants in Table 1.
Mutants may also be combined to form a larger mutant. For example, mutants D, F, H, I, S,
V, X and AA of Table 1 were combined to form a larger mutant termed "8x" with a sequence
as set forth in SEQ ID NO:375 (polypeptide encoding nucleic acid) and SEQ ID NO:376
(amino acid sequence). Figure 5 is a graph comparing the activity of the wild type sequence
(SEQ ID NOS: 189 and 190) to the 8x mutant (SEQ ID NOS: 259 and 260). In comparing
the wild type and the 8x mutant, it was discovered that the optimal temperature for both was
65°C and that the optimal pH for both was 5.5. The wild type sequence was found to
maintain its stability for less than 1 minute at 65°C, while the 8x mutant (SEQ ID NOS:375,
376) was found to maintain its stability for more than 10 minutes at 85°C. The substrate used
was AZO-AZO-xylan. In one aspect, the 8x mutant (SEQ ID NOS:375, 376) was evolved by
GSSM™. In another aspect, the wild type is a GSSM™ parent for thermal tolerance
evolution.

Table 1

Mutant   Mutation        Wild type Seq   GSSM™ Seq
A        A2F             GCC             TTT
B        A2D             GCC             GAC
C        A5H             GCT             CAC
D        D8F             GAC             TTC
E        Q11L            CAA             CTC
F        Q11H            CAA             CAC
G        N12L            AAT             TTG
H        N12L            AAT             TTG
I        G17I            GGT             ATA
J        Q11H, T23T      CAA, ACC        CAT, ACG
K        Q11H            CAA             CAT
L        S26P            TCT             CCG
M        S26P            TCT             CCA
N        S35F            TCA             TTT
O        No Change       GTT             GTA
P        A51P            GCA             CCG
Q        A51P            GCA             CCG
R        G60R            GGA             CGC
S        G60H            GGA             CAC
T        G60H            GGA             CAC
U        P64C            CCG             TGT
V        P64V            CCG             GTA
W        P64V            CCG             GTT
X        S65V            TCC             GTG
Y        Q11H            CAA             CAT
Z        G68I            GGA             ATA
AA       G68A            GGA             GCT
BB       A71G            GCT             GGA
CC       No Change       AAT             AAC
DD       S79P            TCA             CCA
EE       S79P            TCA             CCC
FF       T95S            ACT             TCT
GG       Y98P            TAT             CCG
HH       T114S           ACT             AGC
II       No Change       AAC             AAC
JJ       No Change       AGG             AGA
KK       I142L           ATT             CTG
LL       S151I           AGC             ATC
MM       S138T, S151A    TCG, AGC        ACG, GCG
NN       K158R           AAG             CGG
OO       K160V, V172I    AAA, GTA        GTT, ATA



The codon variants as set forth in Table 2 that produced variants (of SEQ ID
NO:189) with the best variation or "improvement" over "wild type" (SEQ ID NO:189) in
thermal tolerance are highlighted. As noted above, the invention provides nucleic acids, and
the polypeptides encoded by them, comprising one, several or all of the variations set forth in
Table 2 and Table 1.

Table 2

Mutation   Wild type   GSSM™        Other codons also coding for same changed amino acid
A2F        GCC         TTT          TTC
A2D        GCC         GAC          GAT
A5H        GCT         CAC          CAT
D8F*       GAC         TTC          TTT
Q11L       CAA         CTC          TTA, TTG, CTT, CTA, CTG
Q11H*      CAA         CAC, CAT     -
N12L*      AAT         TTG          TTA, CTC, CTT, CTA, CTG
G17I*      GGT         ATA          ATT, ATC
T23T       ACC         ACG          ACT, ACC, ACA
S26P       TCT         CCG, CCA     CCC
S35F       TCA         TTT          TTC
A51P       GCA         CCG          CCC, CCA
G60R       GGA         CGC          CGT, CGA, CGG, AGA, AGG
G60H*      GGA         CAC          CAT
P64C*      CCG         TGT          TGC
P64V*      CCG         GTA, GTT     GTC, GTG
S65V*      TCC         GTG          GTC, GTA, GTT
G68I*      GGA         ATA          ATT, ATC
G68A*      GGA         GCT          GCG, GCC, GCA
A71G       GCT         GGA          GGT, GGC, GGG
S79P*      TCA         CCA, CCC     CCG
T95S       ACT         TCT          TCC, TCA, TCG, AGT, AGC
Y98P       TAT         CCG          CCC, CCA
T114S      ACT         AGC          TCC, TCA, TCG, AGT, TCT
I142L      ATT         CTG          TTA, CTC, CTT, CTA, TTG
S151I      AGC         ATC          ATT, ATA
S138T      TCG         ACG          ACT, ACC, ACA
S151A      AGC         GCG          GCT, GCC, GCA
K158R      AAG         CGG          CGT, CGA, CGC, AGA, AGG
K160V      AAA         GTT          GTC, GTA, GTG
V172I      GTA         ATA          ATT, ATC

* Variant highlighted in the original table.
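
The last column of Table 2 lists further codons that encode the same changed amino acid as the GSSM™ codon. Under the standard genetic code these can be enumerated mechanically, as in the following minimal Python sketch. The function name `other_codons` is hypothetical, and the output covers every synonymous codon, so it may list codons that a given row of the table does not show.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def other_codons(amino_acid: str, used_codons: list[str]) -> list[str]:
    """Sorted codons encoding amino_acid, excluding those already used."""
    used = {codon.upper() for codon in used_codons}
    return sorted(codon for codon, residue in CODON_TABLE.items()
                  if residue == amino_acid.upper() and codon not in used)


if __name__ == "__main__":
    # K158R row: the GSSM codon is CGG; list the remaining arginine codons.
    print(other_codons("R", ["CGG"]))
```

For the K158R row, excluding CGG leaves the five arginine codons AGA, AGG, CGA, CGC and CGT, the same set the table gives for that row.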



In one aspect, an amino acid sequence (SEQ ID NO:208) of the Group B amino
acid sequences is modified by a single amino acid mutation. In
a specific aspect, that mutation is an asparagine to aspartic acid mutation. The resulting
amino acid sequence and corresponding nucleic acid sequence are set forth as SEQ ID
NO:252 and SEQ ID NO:251, respectively. Single amino acid mutations with an
improvement in the pH optimum of the enzyme, such as the mutation of SEQ ID NO:208,
have been shown in the art with respect to xylanases. (See, for example, Joshi, M., Sidhu, G.,
Pot, I., Brayer, G., Withers, S., McIntosh, L., J. Mol. Biol. 299:255-279 (2000).) It is also
noted that in such single amino acid mutations, portions of the sequences may be removed in
the subcloning process. For example, SEQ ID NO:207 and SEQ ID NO:251 differ in only
one nucleotide, over the area that the sequences align. However, it is noted that a 78
nucleotide area at the N-terminus of SEQ ID NO:207 was removed from the N-terminus of
SEQ ID NO:251 in the subcloning. Additionally, the first three nucleotides in SEQ ID
NO:251 were changed to ATG and then the point mutation was made at the sixth nucleotide
in SEQ ID NO:251.

The term "saturation mutagenesis", "gene site saturated mutagenesis" or 
"GSSM™" includes a method that uses degenerate oligonucleotide primers to introduce point 
mutations into a polynucleotide, as described in detail, below. 

The term "optimized directed evolution system" or "optimized directed
evolution" includes a method for reassembling fragments of related nucleic acid sequences,
e.g., related genes, as explained in detail, below.

The term "synthetic ligation reassembly" or "SLR" includes a method of
ligating oligonucleotide fragments in a non-stochastic fashion, as explained in detail, below.

Generating and Manipulating Nucleic Acids 

The invention provides nucleic acids (e.g., SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5,
SEQ ID NO:7, SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29,
SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39,
SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49,
SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69,
SEQ ID NO:71, SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79,
SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89,
SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95, SEQ ID NO:97, SEQ ID NO:99,
SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107, SEQ ID NO:109,
SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119,
SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID NO:129,
SEQ ID NO:131, SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID NO:139,
SEQ ID NO:141, SEQ ID NO:143, SEQ ID NO:145, SEQ ID NO:147, SEQ ID NO:149,
SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155, SEQ ID NO:157, SEQ ID NO:159,
SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID NO:167, SEQ ID NO:169,
SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID NO:179,
SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID NO:189,
SEQ ID NO:191, SEQ ID NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID NO:199,
SEQ ID NO:201, SEQ ID NO:203, SEQ ID NO:205, SEQ ID NO:207, SEQ ID NO:209,
SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215, SEQ ID NO:217, SEQ ID NO:219,
SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID NO:227, SEQ ID NO:229,
SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID NO:239,
SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID NO:249,
SEQ ID NO:251, SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259,
SEQ ID NO:261, SEQ ID NO:263, SEQ ID NO:265, SEQ ID NO:267, SEQ ID NO:269,
SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275, SEQ ID NO:277, SEQ ID NO:279,
SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID NO:287, SEQ ID NO:289,
SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID NO:299,
SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID NO:309,
SEQ ID NO:311, SEQ ID NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID NO:319,
SEQ ID NO:321, SEQ ID NO:323, SEQ ID NO:325, SEQ ID NO:327, SEQ ID NO:329,
SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335, SEQ ID NO:337, SEQ ID NO:339,
SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID NO:347, SEQ ID NO:349,
SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID NO:359,
SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID NO:369,
SEQ ID NO:371, SEQ ID NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID NO:379;
nucleic acids encoding polypeptides as set forth in
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30,
SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40,
SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:72, SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:78, SEQ ID NO:80,
SEQ ID NO:82, SEQ ID NO:84, SEQ ID NO:86, SEQ ID NO:88, SEQ ID NO:90,
SEQ ID NO:92, SEQ ID NO:94, SEQ ID NO:96, SEQ ID NO:98, SEQ ID NO:100,
SEQ ID NO:102, SEQ ID NO:104, SEQ ID NO:106, SEQ ID NO:108, SEQ ID NO:110,
SEQ ID NO:112, SEQ ID NO:114, SEQ ID NO:116, SEQ ID NO:118, SEQ ID NO:120,
SEQ ID NO:122, SEQ ID NO:124, SEQ ID NO:126, SEQ ID NO:128, SEQ ID NO:130,
SEQ ID NO:132, SEQ ID NO:134, SEQ ID NO:136, SEQ ID NO:138, SEQ ID NO:140,
SEQ ID NO:142, SEQ ID NO:144, SEQ ID NO:146, SEQ ID NO:148, SEQ ID NO:150,
SEQ ID NO:152, SEQ ID NO:154, SEQ ID NO:156, SEQ ID NO:158, SEQ ID NO:160,
SEQ ID NO:162, SEQ ID NO:164, SEQ ID NO:166, SEQ ID NO:168, SEQ ID NO:170,
SEQ ID NO:172, SEQ ID NO:174, SEQ ID NO:176, SEQ ID NO:178, SEQ ID NO:180,
SEQ ID NO:182, SEQ ID NO:184, SEQ ID NO:186, SEQ ID NO:188, SEQ ID NO:190,
SEQ ID NO:192, SEQ ID NO:194, SEQ ID NO:196, SEQ ID NO:198, SEQ ID NO:200,
SEQ ID NO:202, SEQ ID NO:204, SEQ ID NO:206, SEQ ID NO:208, SEQ ID NO:210,
SEQ ID NO:212, SEQ ID NO:214, SEQ ID NO:216, SEQ ID NO:218, SEQ ID NO:220,
SEQ ID NO:222, SEQ ID NO:224, SEQ ID NO:226, SEQ ID NO:228, SEQ ID NO:230,
SEQ ID NO:232, SEQ ID NO:234, SEQ ID NO:236, SEQ ID NO:238, SEQ ID NO:240,
SEQ ID NO:242, SEQ ID NO:244, SEQ ID NO:246, SEQ ID NO:248, SEQ ID NO:250,
SEQ ID NO:252, SEQ ID NO:254, SEQ ID NO:256, SEQ ID NO:258, SEQ ID NO:260,
SEQ ID NO:262, SEQ ID NO:264, SEQ ID NO:266, SEQ ID NO:268, SEQ ID NO:270,
SEQ ID NO:272, SEQ ID NO:274, SEQ ID NO:276, SEQ ID NO:278, SEQ ID NO:280,
SEQ ID NO:282, SEQ ID NO:284, SEQ ID NO:286, SEQ ID NO:288, SEQ ID NO:290,
SEQ ID NO:292, SEQ ID NO:294, SEQ ID NO:296, SEQ ID NO:298, SEQ ID NO:300,
SEQ ID NO:302, SEQ ID NO:304, SEQ ID NO:306, SEQ ID NO:308, SEQ ID NO:310,
SEQ ID NO:312, SEQ ID NO:314, SEQ ID NO:316, SEQ ID NO:318, SEQ ID NO:320,
SEQ ID NO:322, SEQ ID NO:324, SEQ ID NO:326, SEQ ID NO:328, SEQ ID NO:330,
SEQ ID NO:332, SEQ ID NO:334, SEQ ID NO:336, SEQ ID NO:338, SEQ ID NO:340,
SEQ ID NO:342, SEQ ID NO:344, SEQ ID NO:346, SEQ ID NO:348, SEQ ID NO:350,
SEQ ID NO:352, SEQ ID NO:354, SEQ ID NO:356, SEQ ID NO:358, SEQ ID NO:360,
SEQ ID NO:362, SEQ ID NO:364, SEQ ID NO:366, SEQ ID NO:368, SEQ ID NO:370,
SEQ ID NO:372, SEQ ID NO:374, SEQ ID NO:376, SEQ ID NO:378 or SEQ ID NO:380),
including expression cassettes such as expression vectors, encoding the
polypeptides of the invention. The invention also includes methods for discovering new
xylanase sequences using the nucleic acids of the invention. The invention also includes
methods for inhibiting the expression of xylanase genes, transcripts and polypeptides using
the nucleic acids of the invention. Also provided are methods for modifying the nucleic acids
of the invention by, e.g., synthetic ligation reassembly, optimized directed evolution system
and/or saturation mutagenesis.

The nucleic acids of the invention can be made, isolated and/or manipulated
by, e.g., cloning and expression of cDNA libraries, amplification of message or genomic
DNA by PCR, and the like. For example, the following exemplary sequences of the
invention were initially derived from the following sources:

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Table 3 





SEOID 


SOURCE 




1.2 


Bacteria 




101, 102 


Environmental 


5 


103, 104 


Bacteria 




105, 106 


Environmental 




107, 108 


Bacteria 




109, 110 


Environmental 




11,12 


Environmental 


10 


111, 112 


Environmental 




113, 114 


Environmental 




115, 116 


Environmental 




117, 118 


Environmental 




119, 120 


Environmental 


15 


121, 122 


Environmental 




123, 124 


Environmental 




125, 126 


Environmental 




127, 128 


Environmental 




129, 130 


Bacteria 


20 


13, 14 


Environmental 




131, 132 


Environmental 




133, 134 


Environmental 




135, 136 


Environmental 




137, 138 


Environmental 


25 


139, 140 


Environmental 




141, 142 


Environmental 




143, 144 


Bacteria 




145, 146 


Eukaryote 




147, 148 


Environmental 


30 


149, 150 


Environmental 




15, 16 


Environmental 




151, 152 


Environmental 




153, 154 


Environmental 




155, 156 


Environmental 


35 


157, 158 


Environmental 




159, 160 


Environmental 




161, 162 


Environmental 




163, 164 


Environmental 




165, 166 


Environmental 


40 


167, 168 


Environmental 




169, 170 


Environmental 




17,18 


Bacteria 




171, 172 


Environmental 




173, 174 


Environmental 


45 


175, 176 


Environmental 




177, 178 


Environmental 




179, 180 


Environmental 




181, 182 


Environmental 




183, 184 


Environmental 


50 


185, 186 


Environmental 
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1 C*7 ICR 
lo/, loo 


iinvironmeniai 




i so i on 


j^nvironineniai 




i o on 
iy, zu 


jzjii v j roiiiTi en ui 1 




101 1Q0 

iyi, iyz 


jjfiivironiiieii idi 


c 

D 


1 1 Ozl 

iyj, iy*f 


jjfnviroiiiiieniai 




iyD, iyo 


j^nvironmeniai 




iy/, iy© 


i^nvironmeniai 




1 oo oaa 

iyy, zuu 


iinvironmeniai 




rjAi OAO 

201, 202 


Environmental 


1 n 
1U 


oao OA/1 
203, 204 


Environmental 




OA< OA/J 

205, 206 


Environmental 




o at ono 

207, 20o 


Environmental 




209, 210 


Environmental 




A 1 oo 

21,22 


Environmental 


15 


O 1 1 o i o 

211, 212 


Environmental 




A 1 *5 O 1 /! 

213, 214 


Environmental 




Air O 1 /C 

215, 216 


Environmental 




on o 1 © 

217, 218 


Environmental 




rti a OO A 

219, 220 


Environmental 


20 


OO 1 MOO 

221, 222 


Environmental 




OOO AO/I 

223, 224 


Environmental 




oo< OO/C 
225, ZZ6 


unvEronmeniai 




00"7 OOO 

227, 22o 


Environmental 




OOA 0*5A 

229, 230 


Environmental 


o c 
25 


o*2 o/l 
23, 24 


Environmental 




Oa 1 0*20 

23 1 , 232 


Bacteria 




o m oi/i 

233, 234 


Environmental 




235, 236 


Environmental 




0^5*7 OOQ 

237, 23o 


Environmental 


30 


O O A O A A 

239, 240 


Environmental 




O yf 1 O /4 O 

241, 242 


Environmental 




O >f O O /I vl 

243, 244 


Environmental 




245, 246 


Environmental 




O /IT O/IO 

247, 24o 


Environmental 


35 


O yl A OCA 

249, 250 


Environmental 




or O/C 

25, 26 


Environmental 




OC1 oco 
251, 252 


Environmental 




253, 254 


Environmental 




Z55, Z56 


Environmental 


40 


OC7 OC© 

Z5 /, Z5o 


Environmental 




OCA. O/^A 

Z59, Z6U 


Environmental 




o/ri o<o 
Z61, Z&Z 


Environmental 




263, 264 


Environmental 




O/CC o<< 

265, 266 


Environmental 


A C 

45 


Z6/, z6o 


Bacteria 




zoy, z/u 


Environmental 




27, 28 


Environmental 




271, 272 


Environmental 




273, 274 


Environmental 


50 


275, 276 


Environmental 
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277, 278 Environmental 

279, 280 Environmental

281, 282 Environmental

283, 284 Environmental

285, 286 Environmental

287, 288 Environmental

289, 290 Environmental

29, 30 Archaea

291, 292 Environmental

293, 294 Environmental

295, 296 Environmental

297, 298 Environmental

299, 300 Environmental

3, 4 Environmental

301, 302 Environmental

303, 304 Environmental

305, 306 Bacteria

307, 308 Environmental

309, 310 Environmental

31, 32 Environmental

311, 312 Environmental

313, 314 Bacteria

315, 316 Environmental

317, 318 Environmental

319, 320 Environmental

321, 322 Environmental

323, 324 Environmental

325, 326 Environmental

327, 328 Environmental

329, 330 Environmental

33, 34 Environmental

331, 332 Environmental

333, 334 Environmental

335, 336 Environmental

337, 338 Environmental

339, 340 Environmental

341, 342 Environmental

343, 344 Environmental

345, 346 Environmental

347, 348 Environmental

349, 350 Environmental

35, 36 Environmental

351, 352 Environmental

353, 354 Environmental

355, 356 Environmental

357, 358 Environmental

359, 360 Environmental

361, 362 Environmental

363, 364 Environmental

365, 366 Environmental

367, 368 Environmental

369, 370 Environmental

37, 38 Environmental

371, 372 Environmental

373, 374 Environmental

375, 376 Archaea

377, 378 Archaea

39, 40 Environmental

41, 42 Environmental

43, 44 Environmental

45, 46 Environmental

47, 48 Environmental

49, 50 Environmental

5, 6 Environmental

51, 52 Environmental

53, 54 Bacteria

55, 56 Environmental

57, 58 Environmental

59, 60 Environmental

61, 62 Environmental

63, 64 Environmental

65, 66 Environmental

67, 68 Environmental

69, 70 Environmental

7, 8 Environmental

71, 72 Environmental

73, 74 Environmental

75, 76 Environmental

77, 78 Environmental

79, 80 Environmental

81, 82 Environmental

83, 84 Environmental

85, 86 Bacteria

87, 88 Environmental

9, 10 Environmental

91, 92 Environmental

93, 94 Environmental

95, 96 Environmental

97, 98 Environmental

99, 100 Environmental



In one aspect, the invention also provides xylanase-encoding nucleic acids 
with a common novelty in that they are derived from an environmental source, or a bacterial 
source, or an archaeal source. 
In practicing the methods of the invention, homologous genes can be modified
by manipulating a template nucleic acid, as described herein. The invention can be practiced
in conjunction with any method or protocol or device known in the art, which are well
described in the scientific and patent literature.

One aspect of the invention is an isolated nucleic acid comprising one of the 
sequences of Group A nucleic acid sequences and sequences substantially identical thereto, 
the sequences complementary thereto, or a fragment comprising at least 10, 15, 20, 25, 30, 35,
40, 50, 75, 100, 150, 200, 300, 400, or 500 consecutive bases of one of the sequences of a
Group A nucleic acid sequence (or the sequences complementary thereto). The isolated
nucleic acids may comprise DNA, including cDNA, genomic DNA and synthetic DNA. The
DNA may be double-stranded or single-stranded and if single stranded may be the coding
strand or non-coding (anti-sense) strand. Alternatively, the isolated nucleic acids may
comprise RNA. 
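
By way of illustration only, the notions of a complementary sequence and of a fragment of consecutive bases can be sketched in Python as follows. The helper names and the 30-base example sequence are hypothetical and are not taken from any Group A nucleic acid sequence.

```python
# Illustrative sketch only: helper names and the example sequence are
# assumptions, not part of the disclosed sequences.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary (anti-sense) strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def fragments(seq: str, length: int):
    """Yield every run of `length` consecutive bases contained in `seq`."""
    for start in range(len(seq) - length + 1):
        yield seq[start:start + length]


example = "ATGGCTAGCTTGACCGGTACCGAATTCTAA"  # hypothetical 30-base sequence
ten_mers = list(fragments(example, 10))      # all fragments of 10 consecutive bases
antisense = reverse_complement(example)
```

A sequence of n bases contains n - L + 1 fragments of L consecutive bases, so the 30-base example yields 21 fragments of 10 bases.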

As discussed in more detail below, the isolated nucleic acids of one of the 
Group A nucleic acid sequences and sequences substantially identical thereto, may be used to 
prepare one of the polypeptides of a Group B amino acid sequence and sequences
substantially identical thereto, or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40,
50, 75, 100, or 150 consecutive amino acids of one of the polypeptides of Group B amino acid 
sequences and sequences substantially identical thereto. 

Accordingly, another aspect of the invention is an isolated nucleic acid which 
encodes one of the polypeptides of Group B amino acid sequences and sequences
substantially identical thereto, or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40,
50, 75, 100, or 150 consecutive amino acids of one of the polypeptides of the Group B amino 
acid sequences. The coding sequences of these nucleic acids may be identical to one of the 
coding sequences of one of the nucleic acids of Group A nucleic acid sequences, or a 
fragment thereof or may be different coding sequences which encode one of the polypeptides
of Group B amino acid sequences, sequences substantially identical thereto and fragments
having at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids of one 
of the polypeptides of Group B amino acid sequences, as a result of the redundancy or 
degeneracy of the genetic code. The genetic code is well known to those of skill in the art 
and can be obtained, for example, on page 214 of B. Lewin, Genes VI, Oxford University
Press, 1997.
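
The degeneracy referred to above can be illustrated with a short Python sketch, offered as an example under stated assumptions rather than as part of the disclosure. It builds the standard genetic code table and shows two different coding sequences that encode the same polypeptide. The two 12-base sequences and the function name `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical coding sequences that differ at three codons.
seq_a = "ATGGCTCTGAAA"
seq_b = "ATGGCCTTAAAG"

same_polypeptide = translate(seq_a) == translate(seq_b)

# Number of synonymous codons for leucine in the standard code.
leucine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "L")
```

Both sequences translate to Met-Ala-Leu-Lys, although they differ in 4 of 12 bases.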

The isolated nucleic acid which encodes one of the polypeptides of Group B 
amino acid sequences and sequences substantially identical thereto, may include, but is not 
limited to: only the coding sequence of one of Group A nucleic acid sequences and sequences 
substantially identical thereto and additional coding sequences, such as leader sequences or
proprotein sequences and non-coding sequences, such as introns or non-coding sequences 5'
and/or 3' of the coding sequence. Thus, as used herein, the term "polynucleotide encoding a
polypeptide" encompasses a polynucleotide which includes only the coding sequence for the
polypeptide as well as a polynucleotide which includes additional coding and/or non-coding
sequence.

Alternatively, the nucleic acid sequences of Group A nucleic acid sequences 
and sequences substantially identical thereto, may be mutagenized using conventional 
techniques, such as site directed mutagenesis, or other techniques familiar to those skilled in 
the art, to introduce silent changes into the polynucleotides of Group A nucleic acid
sequences and sequences substantially identical thereto. As used herein, "silent changes"
include, for example, changes which do not alter the amino acid sequence encoded by the 
polynucleotide. Such changes may be desirable in order to increase the level of the 
polypeptide produced by host cells containing a vector encoding the polypeptide by 
introducing codons or codon pairs which occur frequently in the host organism. 
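
A minimal Python sketch of introducing silent changes is given below for illustration. It replaces codons with synonymous codons taken from a preferred-codon map and confirms that the encoded amino acid sequence is unchanged. The map `PREFERRED`, the function `recode` and the example sequence are hypothetical; a real preferred-codon map would be derived from codon usage data for the chosen host organism.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Hypothetical preferred codons for an assumed host (illustrative values only).
PREFERRED = {"A": "GCG", "L": "CTG", "K": "AAA"}


def translate(dna: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, if one is given."""
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        choice = preferred.get(aa, codon)
        if CODON_TABLE[choice] != aa:
            raise ValueError(f"{choice} does not encode {aa}")
        out.append(choice)
    return "".join(out)


original = "ATGGCCTTAAAG"            # hypothetical coding sequence
recoded = recode(original, PREFERRED)
```

Because every substitution is checked against the codon table, the recoded sequence always encodes the same polypeptide as the original.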

The invention also relates to polynucleotides which have nucleotide changes
which result in amino acid substitutions, additions, deletions, fusions and truncations in the
polypeptides of Group B amino acid sequences and sequences substantially identical thereto. 
Such nucleotide changes may be introduced using techniques such as site directed 
mutagenesis, random chemical mutagenesis, exonuclease III deletion and other recombinant
DNA techniques. Alternatively, such nucleotide changes may be naturally occurring allelic
variants which are isolated by identifying nucleic acids which specifically hybridize to probes 
comprising at least 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, or 500 
consecutive bases of one of the sequences of Group A nucleic acid sequences and sequences 
substantially identical thereto (or the sequences complementary thereto) under conditions of
high, moderate, or low stringency as provided herein.

General Techniques 

The nucleic acids used to practice this invention, whether RNA, iRNA, 
antisense nucleic acid, cDNA, genomic DNA, vectors, viruses or hybrids thereof, may be 
isolated from a variety of sources, genetically engineered, amplified, and/or expressed/generated
recombinantly. Recombinant polypeptides (e.g., xylanases) generated from these
nucleic acids can be individually isolated or cloned and tested for a desired activity. Any 
recombinant expression system can be used, including bacterial, mammalian, yeast, insect or 
plant cell expression systems.

Alternatively, these nucleic acids can be synthesized in vitro by well-known
chemical synthesis techniques, as described in, e.g., Adams (1983) J. Am. Chem. Soc. 
105:661; Belousov (1997) Nucleic Acids Res. 25:3440-3444; Frenkel (1995) Free Radic. 
Biol. Med. 19:373-380; Blommers (1994) Biochemistry 33:7886-7896; Narang (1979) Meth.
Enzymol. 68:90; Brown (1979) Meth. Enzymol. 68:109; Beaucage (1981) Tetra. Lett.
22:1859; U.S. Patent No. 4,458,066. 

Techniques for the manipulation of nucleic acids, such as, e.g., subcloning, 
labeling probes (e.g., random-primer labeling using Klenow polymerase, nick translation, 
amplification), sequencing, hybridization and the like are well described in the scientific and
patent literature, see, e.g., Sambrook, ed., MOLECULAR CLONING: A LABORATORY
MANUAL (2ND ED.), Vols. 1-3, Cold Spring Harbor Laboratory, (1989); CURRENT 
PROTOCOLS IN MOLECULAR BIOLOGY, Ausubel, ed. John Wiley & Sons, Inc., New 
York (1997); LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR 
BIOLOGY: HYBRIDIZATION WITH NUCLEIC ACID PROBES, Part I. Theory and
Nucleic Acid Preparation, Tijssen, ed. Elsevier, N.Y. (1993).

Another useful means of obtaining and manipulating nucleic acids used to 
practice the methods of the invention is to clone from genomic samples, and, if desired, 
screen and re-clone inserts isolated or amplified from, e.g., genomic clones or cDNA clones. 
Sources of nucleic acid used in the methods of the invention include genomic or cDNA
libraries contained in, e.g., mammalian artificial chromosomes (MACs), see, e.g., U.S. Patent
Nos. 5,721,118; 6,025,155; human artificial chromosomes, see, e.g., Rosenfeld (1997) Nat. 
Genet. 15:333-335; yeast artificial chromosomes (YAC); bacterial artificial chromosomes 
(BAC); P1 artificial chromosomes, see, e.g., Woon (1998) Genomics 50:306-316; P1-derived
vectors (PACs), see, e.g., Kern (1997) Biotechniques 23:120-124; cosmids, recombinant
viruses, phages or plasmids.

In one aspect, a nucleic acid encoding a polypeptide of the invention is 
assembled in appropriate phase with a leader sequence capable of directing secretion of the 
translated polypeptide or fragment thereof. 

The invention provides fusion proteins and nucleic acids encoding them. A
polypeptide of the invention can be fused to a heterologous peptide or polypeptide, such as
N-terminal identification peptides which impart desired characteristics, such as increased 
stability or simplified purification. Peptides and polypeptides of the invention can also be 
synthesized and expressed as fusion proteins with one or more additional domains linked 
thereto for, e.g., producing a more immunogenic peptide, to more readily isolate a
recombinantly synthesized peptide, to identify and isolate antibodies and antibody-expressing
B cells, and the like. Detection and purification facilitating domains include, e.g., metal
chelating peptides such as polyhistidine tracts and histidine-tryptophan modules that allow
purification on immobilized metals, protein A domains that allow purification on
immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity
purification system (Immunex Corp, Seattle WA). A cleavable linker sequence, such as a
Factor Xa or enterokinase site (Invitrogen, San Diego CA), can be included between a
purification domain and the motif-comprising peptide or polypeptide to facilitate purification.
For example, an expression vector can include an epitope-encoding nucleic acid sequence
linked to six histidine residues followed by a thioredoxin and an enterokinase cleavage site
(see e.g., Williams (1995) Biochemistry 34:1787-1797; Dobeli (1998) Protein Expr. Purif. 
12:404-414). The histidine residues facilitate detection and purification while the 
enterokinase cleavage site provides a means for purifying the epitope from the remainder of 
the fusion protein. Technology pertaining to vectors encoding fusion proteins and application
of fusion proteins are well described in the scientific and patent literature, see e.g., Kroll
(1993) DNA Cell. Biol., 12:441-53. 
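
The arrangement of a purification domain, a cleavable linker and a polypeptide of interest can be sketched at the amino acid level as follows. This is a simplified illustration: the thioredoxin domain mentioned above is omitted, the target sequence is hypothetical, and the linker is assumed to be the commonly cited enterokinase recognition sequence Asp-Asp-Asp-Asp-Lys, with cleavage taken to occur after the lysine.

```python
# Simplified sketch; all sequences and names below are illustrative assumptions.
HIS_TAG = "HHHHHH"   # six histidine residues (purification domain)
EK_SITE = "DDDDK"    # assumed enterokinase recognition sequence


def build_fusion(target: str, tag: str = HIS_TAG, site: str = EK_SITE) -> str:
    """Place the tag and cleavable linker N-terminal to the target polypeptide."""
    return tag + site + target


def cleave(fusion: str, site: str = EK_SITE):
    """Split the fusion after the first cleavage site, returning (tag part, target)."""
    head, found, tail = fusion.partition(site)
    if not found:
        raise ValueError("no cleavage site present in fusion protein")
    return head + found, tail


target = "MKTLLA"                      # hypothetical polypeptide of interest
fusion = build_fusion(target)
tag_part, released = cleave(fusion)
```

In this model the tag and linker remain together after cleavage, and the released portion is the unmodified target polypeptide.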

Transcriptional and translational control sequences 

The invention provides nucleic acid (e.g., DNA) sequences of the invention 
operatively linked to expression (e.g., transcriptional or translational) control sequence(s),
e.g., promoters or enhancers, to direct or modulate RNA synthesis/expression. The
expression control sequence can be in an expression vector. Exemplary bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Exemplary eukaryotic promoters
include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from
retrovirus, and mouse metallothionein I. 

Promoters suitable for expressing a polypeptide in bacteria include the E. coli
lac or trp promoters, the lacI promoter, the lacZ promoter, the T3 promoter, the T7 promoter,
the gpt promoter, the lambda PR promoter, the lambda PL promoter, promoters from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), and the acid
phosphatase promoter. Eukaryotic promoters include the CMV immediate early promoter,
the HSV thymidine kinase promoter, heat shock promoters, the early and late SV40 promoter,
LTRs from retroviruses, and the mouse metallothionein-I promoter. Other promoters known
to control expression of genes in prokaryotic or eukaryotic cells or their viruses may also be
used. Promoters suitable for expressing the polypeptide or fragment thereof in bacteria
include the E. coli lac or trp promoters, the lacI promoter, the lacZ promoter, the T3
promoter, the T7 promoter, the gpt promoter, the lambda PR promoter, the lambda PL
promoter, promoters from operons encoding glycolytic enzymes such as 3-phosphoglycerate
kinase (PGK) and the acid phosphatase promoter. Fungal promoters include the α factor
promoter. Eukaryotic promoters include the CMV immediate early promoter, the HSV
thymidine kinase promoter, heat shock promoters, the early and late SV40 promoter, LTRs
from retroviruses and the mouse metallothionein-I promoter. Other promoters known to
control expression of genes in prokaryotic or eukaryotic cells or their viruses may also be
used.

Tissue-Specific Plant Promoters

The invention provides expression cassettes that can be expressed in a tissue- 
specific manner, e.g., that can express a xylanase of the invention in a tissue-specific manner. 
The invention also provides plants or seeds that express a xylanase of the invention in a 
tissue-specific manner. The tissue-specificity can be seed specific, stem specific, leaf
specific, root specific, fruit specific and the like.

In one aspect, a constitutive promoter such as the CaMV 35S promoter can be 
used for expression in specific parts of the plant or seed or throughout the plant. For
example, for overexpression, a plant promoter fragment can be employed which will direct
expression of a nucleic acid in some or all tissues of a plant, e.g., a regenerated plant. Such
promoters are referred to herein as "constitutive" promoters and are active under most
environmental conditions and states of development or cell differentiation. Examples of 
constitutive promoters include the cauliflower mosaic virus (CaMV) 35S transcription 
initiation region, the 1'- or 2'- promoter derived from T-DNA of Agrobacterium tumefaciens,
and other transcription initiation regions from various plant genes known to those of skill.
Such genes include, e.g., ACT11 from Arabidopsis (Huang (1996) Plant Mol. Biol. 33:125-
139); Cat3 from Arabidopsis (GenBank No. U43147, Zhong (1996) Mol. Gen. Genet.
251:196-203); the gene encoding stearoyl-acyl carrier protein desaturase from Brassica napus
(GenBank No. X74782, Solocombe (1994) Plant Physiol. 104:1167-1176); GPc1 from maize
(GenBank No. X15596; Martinez (1989) J. Mol. Biol. 208:551-565); the Gpc2 from maize
(GenBank No. U45855, Manjunath (1997) Plant Mol. Biol. 33:97-112); plant promoters
described in U.S. Patent Nos. 4,962,028; 5,633,440. 

The invention uses tissue-specific or constitutive promoters derived from 
viruses which can include, e.g., the tobamovirus subgenomic promoter (Kumagai (1995)
Proc. Natl. Acad. Sci. USA 92:1679-1683); the rice tungro bacilliform virus (RTBV), which
replicates only in phloem cells in infected rice plants, with its promoter which drives strong
phloem-specific reporter gene expression; the cassava vein mosaic virus (CVMV) promoter,
with highest activity in vascular elements, in leaf mesophyll cells, and in root tips (Verdaguer
(1996) Plant Mol. Biol. 31:1129-1139).

Alternatively, the plant promoter may direct expression of xylanase-expressing
nucleic acid in a specific tissue, organ or cell type (i.e., tissue-specific promoters)
or may be otherwise under more precise environmental or developmental control or under the
control of an inducible promoter. Examples of environmental conditions that may affect
transcription include anaerobic conditions, elevated temperature, the presence of light, or
spraying with chemicals/hormones. For example, the invention incorporates the drought-
inducible promoter of maize (Busk (1997) supra); the cold, drought, and high salt inducible
promoter from potato (Kirch (1997) Plant Mol. Biol. 33:897-909).

Tissue-specific promoters can promote transcription only within a certain time
frame of developmental stage within that tissue. See, e.g., Blazquez (1998) Plant Cell
10:791-800, characterizing the Arabidopsis LEAFY gene promoter. See also Cardon (1997)
Plant J 12:367-77, describing the transcription factor SPL3, which recognizes a conserved
sequence motif in the promoter region of the A. thaliana floral meristem identity gene AP1;
and Mandel (1995) Plant Molecular Biology, Vol. 29, pp 995-1004, describing the meristem
promoter eIF4. Tissue specific promoters which are active throughout the life cycle of a
particular tissue can be used. In one aspect, the nucleic acids of the invention are operably 
linked to a promoter active primarily only in cotton fiber cells. In one aspect, the nucleic 
acids of the invention are operably linked to a promoter active primarily during the stages of 
cotton fiber cell elongation, e.g., as described by Rinehart (1996) supra. The nucleic acids
can be operably linked to the FbL2A gene promoter to be preferentially expressed in cotton
fiber cells (Ibid). See also, John (1997) Proc. Natl. Acad. Sci. USA 89:5769-5773; John, et
al., U.S. Patent Nos. 5,608,148 and 5,602,321, describing cotton fiber-specific promoters and 
methods for the construction of transgenic cotton plants. Root-specific promoters may also 
be used to express the nucleic acids of the invention. Examples of root-specific promoters
include the promoter from the alcohol dehydrogenase gene (DeLisle (1990) Int. Rev. Cytol.
123:39-60). Other promoters that can be used to express the nucleic acids of the invention
include, e.g., ovule-specific, embryo-specific, endosperm-specific, integument-specific, seed 
coat-specific promoters, or some combination thereof; a leaf-specific promoter (see, e.g., 
Busk (1997) Plant J. 11:1285-1295, describing a leaf-specific promoter in maize); the ORF13
promoter from Agrobacterium rhizogenes (which exhibits high activity in roots, see, e.g.,
Hansen (1997) supra); a maize pollen specific promoter (see, e.g., Guerrero (1990) Mol. Gen.
Genet. 224:161-168); a tomato promoter active during fruit ripening, senescence and
abscission of leaves and, to a lesser extent, of flowers can be used (see, e.g., Blume (1997)
Plant J. 12:731-746); a pistil-specific promoter from the potato SK2 gene (see, e.g., Ficker
(1997) Plant Mol. Biol. 35:425-431); the Blec4 gene from pea, which is active in epidermal
tissue of vegetative and floral shoot apices of transgenic alfalfa making it a useful tool to 
target the expression of foreign genes to the epidermal layer of actively growing shoots or 
fibers; the ovule-specific BEL1 gene (see, e.g., Reiser (1995) Cell 83:735-742, GenBank No.
U39944); and/or, the promoter in Klee, U.S. Patent No. 5,589,583, describing a plant
promoter region that is capable of conferring high levels of transcription in meristematic tissue
and/or rapidly dividing cells. 

Alternatively, plant promoters which are inducible upon exposure to plant 
hormones, such as auxins, are used to express the nucleic acids of the invention. For
example, the invention can use the auxin-response elements E1 promoter fragment (AuxREs)
in the soybean (Glycine max L.) (Liu (1997) Plant Physiol. 115:397-407); the auxin-
responsive Arabidopsis GST6 promoter (also responsive to salicylic acid and hydrogen
peroxide) (Chen (1996) Plant J. 10: 955-966); the auxin-inducible parC promoter from
tobacco (Sakai (1996) 37:906-913); a plant biotin response element (Streit (1997) Mol. Plant
Microbe Interact. 10:933-937); and, the promoter responsive to the stress hormone abscisic
acid (Sheen (1996) Science 274:1900-1902). 

The nucleic acids of the invention can also be operably linked to plant 
promoters which are inducible upon exposure to chemical reagents which can be applied to
the plant, such as herbicides or antibiotics. For example, the maize In2-2 promoter, activated
by benzenesulfonamide herbicide safeners, can be used (De Veylder (1997) Plant Cell
Physiol. 38:568-577); application of different herbicide safeners induces distinct gene
expression patterns, including expression in the root, hydathodes, and the shoot apical
meristem. Coding sequence can be under the control of, e.g., a tetracycline-inducible
promoter, e.g., as described with transgenic tobacco plants containing the Avena sativa L.
(oat) arginine decarboxylase gene (Masgrau (1997) Plant J. 11:465-473); or, a salicylic
acid-responsive element (Stange (1997) Plant J. 11:1315-1324). Using chemically- (e.g.,
hormone- or pesticide-) induced promoters, i.e., promoter responsive to a chemical which can
be applied to the transgenic plant in the field, expression of a polypeptide of the invention can
be induced at a particular stage of development of the plant. Thus, the invention also
provides for transgenic plants containing an inducible gene encoding for polypeptides of the
invention whose host range is limited to target plant species, such as corn, rice, barley, wheat,
potato or other crops, inducible at any stage of development of the crop.

One of skill will recognize that a tissue-specific plant promoter may drive
expression of operably linked sequences in tissues other than the target tissue. Thus, a
tissue-specific promoter is one that drives expression preferentially in the target tissue or cell type,
but may also lead to some expression in other tissues as well. 

The nucleic acids of the invention can also be operably linked to plant 
promoters which are inducible upon exposure to chemical reagents. These reagents include,
e.g., herbicides, synthetic auxins, or antibiotics which can be applied, e.g., sprayed, onto
transgenic plants. Inducible expression of the xylanase-producing nucleic acids of the
invention will allow the grower to select plants with the optimal xylanase expression and/or
activity. The development of plant parts can thus be controlled. In this way the invention
provides the means to facilitate the harvesting of plants and plant parts. For example, in
various embodiments, the maize In2-2 promoter, activated by benzenesulfonamide herbicide
safeners, is used (De Veylder (1997) Plant Cell Physiol. 38:568-577); application of different
herbicide safeners induces distinct gene expression patterns, including expression in the root,
hydathodes, and the shoot apical meristem. Coding sequences of the invention are also under
the control of a tetracycline-inducible promoter, e.g., as described with transgenic tobacco
plants containing the Avena sativa L. (oat) arginine decarboxylase gene (Masgrau (1997)
Plant J. 11:465-473); or, a salicylic acid-responsive element (Stange (1997) Plant J. 
11:1315-1324). 

In some aspects, proper polypeptide expression may require a polyadenylation
region at the 3'-end of the coding region. The polyadenylation region can be derived from the
natural gene, from a variety of other plant (or animal or other) genes, or from genes in the
Agrobacterial T-DNA. 

Expression vectors and cloning vehicles 

The invention provides expression vectors and cloning vehicles comprising 
nucleic acids of the invention, e.g., sequences encoding the xylanases of the invention. 
Expression vectors and cloning vehicles of the invention can comprise viral particles,
baculovirus, phage, plasmids, phagemids, cosmids, fosmids, bacterial artificial chromosomes,
viral DNA (e.g., vaccinia, adenovirus, fowl pox virus, pseudorabies and derivatives of SV40),
P1-based artificial chromosomes, yeast plasmids, yeast artificial chromosomes, and any other
vectors specific for specific hosts of interest (such as Bacillus, Aspergillus and yeast).
Vectors of the invention can include chromosomal, non-chromosomal and synthetic DNA
sequences. Large numbers of suitable vectors are known to those of skill in the art, and are
commercially available. Exemplary vectors include: bacterial: pQE vectors (Qiagen),
pBluescript plasmids, pNH vectors, lambda-ZAP vectors (Stratagene); ptrc99a, pKK223-3,
pDR540, pRIT2T (Pharmacia); eukaryotic: pXT1, pSG5 (Stratagene), pSVK3, pBPV,
pMSG, pSVLSV40 (Pharmacia). However, any other plasmid or other vector may be used so
long as they are replicable and viable in the host. Low copy number or high copy number 
vectors may be employed with the present invention.

The expression vector can comprise a promoter, a ribosome binding site for
translation initiation and a transcription terminator. The vector may also include appropriate
sequences for amplifying expression. Mammalian expression vectors can comprise an origin 
of replication, any necessary ribosome binding sites, a polyadenylation site, splice donor and 
acceptor sites, transcriptional termination sequences, and 5' flanking non-transcribed 
sequences. In some aspects, DNA sequences derived from the SV40 splice and
polyadenylation sites may be used to provide the required non-transcribed genetic elements.

In one aspect, the expression vectors contain one or more selectable marker 
genes to permit selection of host cells containing the vector. Such selectable markers include 
genes encoding dihydrofolate reductase or genes conferring neomycin resistance for 
eukaryotic cell culture, genes conferring tetracycline or ampicillin resistance in E. coli, and
the S. cerevisiae TRP1 gene. Promoter regions can be selected from any desired gene using 
chloramphenicol transferase (CAT) vectors or other vectors with selectable markers. 

Vectors for expressing the polypeptide or fragment thereof in eukaryotic cells 
can also contain enhancers to increase expression levels. Enhancers are cis-acting elements 
of DNA, usually from about 10 to about 300 bp in length that act on a promoter to increase its
transcription. Examples include the SV40 enhancer on the late side of the replication origin 
bp 100 to 270, the cytomegalovirus early promoter enhancer, the polyoma enhancer on the 
late side of the replication origin, and the adenovirus enhancers. 

A nucleic acid sequence can be inserted into a vector by a variety of 
procedures. In general, the sequence is ligated to the desired position in the vector following
digestion of the insert and the vector with appropriate restriction endonucleases. 
Alternatively, blunt ends in both the insert and the vector may be ligated. A variety of 
cloning techniques are known in the art, e.g., as described in Ausubel and Sambrook. Such 
procedures and others are deemed to be within the scope of those skilled in the art.

The vector can be in the form of a plasmid, a viral particle, or a phage. Other
vectors include chromosomal, non-chromosomal and synthetic DNA sequences, derivatives 
of SV40; bacterial plasmids, phage DNA, baculovirus, yeast plasmids, vectors derived from 
combinations of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus, fowl pox 
virus, and pseudorabies. A variety of cloning and expression vectors for use with prokaryotic
and eukaryotic hosts are described by, e.g., Sambrook. 
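
For illustration, insertion of a sequence into a vector by restriction digestion followed by ligation, as described above, can be modeled on top-strand sequences in Python. This is a sketch under stated assumptions: only the EcoRI site (GAATTC, cut after the first G) is considered, overhangs and the complementary strand are not modeled, orientation of the insert is ignored, and the vector and insert sequences are hypothetical.

```python
# Illustrative top-strand model only; sequences and function names are assumptions.
ECORI_SITE = "GAATTC"
CUT_OFFSET = 1  # EcoRI cuts between G and AATTC on the top strand


def digest(seq: str, site: str = ECORI_SITE, offset: int = CUT_OFFSET):
    """Cut a linear sequence at every occurrence of the recognition site."""
    pieces, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        pieces.append(seq[start:pos + offset])
        start = pos + offset
        pos = seq.find(site, pos + 1)
    pieces.append(seq[start:])
    return pieces


def ligate_insert(vector: str, insert_source: str) -> str:
    """Open a vector at its single site and ligate in the excised fragment."""
    left, right = digest(vector)              # fails unless exactly one site
    _, fragment, _ = digest(insert_source)    # fails unless exactly two sites
    return left + fragment + right


vector = "CCCCGAATTCGGGG"                     # hypothetical vector, one site
insert_source = "TTGAATTCATGAAATAAGAATTCTT"   # hypothetical insert, flanked by sites
construct = ligate_insert(vector, insert_source)
```

In this model the recognition site is regenerated on both sides of the insert, which mirrors the behavior of compatible cohesive ends.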

Particular bacterial vectors which can be used include the commercially 
available plasmids comprising genetic elements of the well known cloning vector pBR322 
(ATCC 37017), pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden), GEM1 (Promega 
Biotec, Madison, WI, USA), pQE70, pQE60, pQE-9 (Qiagen), pD10, psiX174, pBluescript II
KS, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene), ptrc99a, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia), pKK232-8 and pCM7. Particular eukaryotic vectors include
pSV2CAT, pOG44, pXT1, pSG (Stratagene), pSVK3, pBPV, pMSG, and pSVL (Pharmacia).
However, any other vector may be used as long as it is replicable and viable in the host cell.

The nucleic acids of the invention can be expressed in expression cassettes,
vectors or viruses and transiently or stably expressed in plant cells and seeds. One exemplary
transient expression system uses episomal expression systems, e.g., cauliflower mosaic virus 
(CaMV) viral RNA generated in the nucleus by transcription of an episomal mini- 
chromosome containing supercoiled DNA, see, e.g., Covey (1990) Proc. Natl. Acad. Sci. 
USA 87:1633-1637. Alternatively, coding sequences, i.e., all or sub-fragments of sequences
of the invention can be inserted into a plant host cell genome becoming an integral part of the 
host chromosomal DNA. Sense or antisense transcripts can be expressed in this manner. A 
vector comprising the sequences (e.g., promoters or coding regions) from nucleic acids of the 
invention can comprise a marker gene that confers a selectable phenotype on a plant cell or a 
seed. For example, the marker may encode biocide resistance, particularly antibiotic
resistance, such as resistance to kanamycin, G418, bleomycin, hygromycin, or herbicide
resistance, such as resistance to chlorosulfuron or Basta. 

Expression vectors capable of expressing nucleic acids and proteins in plants 
are well known in the art, and can include, e.g., vectors from Agrobacterium spp., potato
virus X (see, e.g., Angeli (1997) EMBO J. 16:3675-3684), tobacco mosaic virus (see, e.g.,
Casper (1996) Gene 173:69-73), tomato bushy stunt virus (see, e.g., Hillman (1989) Virology
169:42-50), tobacco etch virus (see, e.g., Dolja (1997) Virology 234:243-252), bean golden 
mosaic virus (see, e.g., Morinaga (1993) Microbiol Immunol. 37:471-476), cauliflower 
mosaic virus (see, e.g., Cecchini (1997) Mol. Plant Microbe Interact. 10:1094-1101), maize
Ac/Ds transposable element (see, e.g., Rubin (1997) Mol. Cell. Biol. 17:6294-6302; Kunze
(1996) Curr. Top. Microbiol. Immunol. 204:161-194), and the maize suppressor-mutator
(Spm) transposable element (see, e.g., Schlappi (1996) Plant Mol. Biol. 32:717-725); and 
derivatives thereof. 

In one aspect, the expression vector can have two replication systems to allow 
it to be maintained in two organisms, for example in mammalian or insect cells for expression 
and in a prokaryotic host for cloning and amplification. Furthermore, for integrating 
expression vectors, the expression vector can contain at least one sequence homologous to the 
host cell genome. It can contain two homologous sequences which flank the expression 
construct. The integrating vector can be directed to a specific locus in the host cell by 
selecting the appropriate homologous sequence for inclusion in the vector. Constructs for 
integrating vectors are well known in the art. 

Expression vectors of the invention may also include a selectable marker gene 
to allow for the selection of bacterial strains that have been transformed, e.g., genes which 
render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, 
kanamycin, neomycin and tetracycline. Selectable markers can also include biosynthetic 
genes, such as those in the histidine, tryptophan and leucine biosynthetic pathways. 

The DNA sequence in the expression vector is operatively linked to an 
appropriate expression control sequence(s) (promoter) to direct RNA synthesis. Particular
named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Eukaryotic
promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs
from retrovirus and mouse metallothionein-I. Selection of the appropriate vector and
promoter is well within the level of ordinary skill in the art. The expression vector also 
contains a ribosome binding site for translation initiation and a transcription terminator. The 
vector may also include appropriate sequences for amplifying expression. Promoter regions 
can be selected from any desired gene using chloramphenicol transferase (CAT) vectors or 
other vectors with selectable markers. In addition, the expression vectors preferably contain 
one or more selectable marker genes to provide a phenotypic trait for selection of transformed 
host cells such as dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, 
or such as tetracycline or ampicillin resistance in E. coli.

Mammalian expression vectors may also comprise an origin of replication, any 
necessary ribosome binding sites, a polyadenylation site, splice donor and acceptor sites,
transcriptional termination sequences and 5' flanking nontranscribed sequences. In some
aspects, DNA sequences derived from the SV40 splice and polyadenylation sites may be used
to provide the required nontranscribed genetic elements. 

Vectors for expressing the polypeptide or fragment thereof in eukaryotic cells 
may also contain enhancers to increase expression levels. Enhancers are cis-acting elements 
of DNA, usually from about 10 to about 300 bp in length that act on a promoter to increase its
transcription. Examples include the SV40 enhancer on the late side of the replication origin 
bp 100 to 270, the cytomegalovirus early promoter enhancer, the polyoma enhancer on the 
late side of the replication origin and the adenovirus enhancers. 

In addition, the expression vectors typically contain one or more selectable 
marker genes to permit selection of host cells containing the vector. Such selectable markers
include genes encoding dihydrofolate reductase or genes conferring neomycin resistance for 
eukaryotic cell culture, genes conferring tetracycline or ampicillin resistance in E. coli and 
the S. cerevisiae TRP1 gene.

In some aspects, the nucleic acid encoding one of the polypeptides of Group B 
amino acid sequences and sequences substantially identical thereto, or fragments comprising
at least about 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof 
is assembled in appropriate phase with a leader sequence capable of directing secretion of the 
translated polypeptide or fragment thereof. Optionally, the nucleic acid can encode a fusion 
polypeptide in which one of the polypeptides of Group B amino acid sequences and 
sequences substantially identical thereto, or fragments comprising at least 5, 10, 15, 20, 25,
30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof is fused to heterologous 
peptides or polypeptides, such as N-terminal identification peptides which impart desired 
characteristics, such as increased stability or simplified purification. 
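
By way of non-limiting illustration, the requirement that a coding sequence be assembled "in appropriate phase" with a leader sequence can be sketched in Python. The function names and the short sequences below are hypothetical and are not taken from the specification; the sketch assumes the standard genetic code and checks only that the leader is a whole number of codons with no in-frame stop.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(leader: str, coding: str):
    """Join leader and coding sequences, refusing fusions that break the frame."""
    leader, coding = leader.upper(), coding.upper()
    if len(leader) % 3 != 0:
        raise ValueError("leader is not a whole number of codons; frame would shift")
    if "*" in translate(leader):
        raise ValueError("leader contains an in-frame stop codon")
    fusion = leader + coding
    return fusion, translate(fusion)


# Hypothetical sequences, for illustration only.
fusion_dna, fusion_protein = fuse_in_frame("ATGAAACTG", "GCTAGCTGGTAA")
print(fusion_protein)
```

A leader of eight nucleotides, for example, would be rejected because the downstream sequence would be read out of phase.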

The appropriate DNA sequence may be inserted into the vector by a variety of 
procedures. In general, the DNA sequence is ligated to the desired position in the vector
following digestion of the insert and the vector with appropriate restriction endonucleases. 
Alternatively, blunt ends in both the insert and the vector may be ligated. A variety of 
cloning techniques are disclosed in Ausubel et al., Current Protocols in Molecular Biology,
John Wiley & Sons, Inc. 1997 and Sambrook et al., Molecular Cloning: A Laboratory Manual
2nd Ed., Cold Spring Harbor Laboratory Press (1989). Such procedures and others are deemed
to be within the scope of those skilled in the art. 

The vector may be, for example, in the form of a plasmid, a viral particle, or a 
phage. Other vectors include chromosomal, nonchromosomal and synthetic DNA sequences, 
derivatives of SV40; bacterial plasmids, phage DNA, baculovirus, yeast plasmids, vectors
derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia,
adenovirus, fowl pox virus and pseudorabies. A variety of cloning and expression vectors for 
use with prokaryotic and eukaryotic hosts are described by Sambrook, et al , Molecular 
Cloning: A Laboratory Manual. 2nd Ed., Cold Spring Harbor, N.Y., (1989). 

Host cells and transformed cells 

The invention also provides a transformed cell comprising a nucleic acid 
sequence of the invention, e.g., a sequence encoding a xylanase of the invention, or a vector 
of the invention. The host cell may be any of the host cells familiar to those skilled in the art, 
including prokaryotic cells, eukaryotic cells, such as bacterial cells, fungal cells, yeast cells, 
mammalian cells, insect cells, or plant cells. Exemplary bacterial cells include E. coli,
Streptomyces, Bacillus subtilis, Salmonella typhimurium and various species within the 
genera Pseudomonas, Streptomyces, and Staphylococcus. Exemplary insect cells include 
Drosophila S2 and Spodoptera Sf9. Exemplary animal cells include CHO, COS or Bowes
melanoma or any mouse or human cell line. The selection of an appropriate host is within the 
abilities of those skilled in the art. Techniques for transforming a wide variety of higher plant 
species are well known and described in the technical and scientific literature. See, e.g., 
Weising (1988) Ann. Rev. Genet. 22:421-477; U.S. Patent No. 5,750,870. 

The vector can be introduced into the host cells using any of a variety of 
techniques, including transformation, transfection, transduction, viral infection, gene guns, or 
Ti-mediated gene transfer. Particular methods include calcium phosphate transfection, 
DEAE-Dextran mediated transfection, lipofection, or electroporation (Davis, L., Dibner, M., 
Battey, I., Basic Methods in Molecular Biology, (1986)).

In one aspect, the nucleic acids or vectors of the invention are introduced into 
the cells for screening, thus, the nucleic acids enter the cells in a manner suitable for 
subsequent expression of the nucleic acid. The method of introduction is largely dictated by 
the targeted cell type. Exemplary methods include CaPO4 precipitation, liposome fusion,
lipofection (e.g., LIPOFECTIN™), electroporation, viral infection, etc. The candidate 
nucleic acids may stably integrate into the genome of the host cell (for example, with 
retroviral introduction) or may exist either transiently or stably in the cytoplasm (i.e. through 
the use of traditional plasmids, utilizing standard regulatory sequences, selection markers, 
etc.). As many pharmaceutically important screens require human or model mammalian cell 
targets, retroviral vectors capable of transfecting such targets can be used.

Where appropriate, the engineered host cells can be cultured in conventional
nutrient media modified as appropriate for activating promoters, selecting transformants or 
amplifying the genes of the invention. Following transformation of a suitable host strain and 
growth of the host strain to an appropriate cell density, the selected promoter may be induced 
by appropriate means (e.g., temperature shift or chemical induction) and the cells may be
cultured for an additional period to allow them to produce the desired polypeptide or 
fragment thereof. 

Cells can be harvested by centrifugation, disrupted by physical or chemical 
means, and the resulting crude extract is retained for further purification. Microbial cells
employed for expression of proteins can be disrupted by any convenient method, including
freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents. Such
methods are well known to those skilled in the art. The expressed polypeptide or fragment
thereof can be recovered and purified from recombinant cell cultures by methods including 
ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange
chromatography, phosphocellulose chromatography, hydrophobic interaction
chromatography, affinity chromatography, hydroxylapatite chromatography and lectin
chromatography. Protein refolding steps can be used, as necessary, in completing 
configuration of the polypeptide. If desired, high performance liquid chromatography 
(HPLC) can be employed for final purification steps. 

The constructs in host cells can be used in a conventional manner to produce
the gene product encoded by the recombinant sequence. Depending upon the host employed
in a recombinant production procedure, the polypeptides produced by host cells containing 
the vector may be glycosylated or may be non-glycosylated. Polypeptides of the invention 
may or may not also include an initial methionine amino acid residue. 

Cell-free translation systems can also be employed to produce a polypeptide of
the invention. Cell-free translation systems can use mRNAs transcribed from a DNA
construct comprising a promoter operably linked to a nucleic acid encoding the polypeptide 
or fragment thereof. In some aspects, the DNA construct may be linearized prior to 
conducting an in vitro transcription reaction. The transcribed mRNA is then incubated with
an appropriate cell-free translation extract, such as a rabbit reticulocyte extract, to produce
the desired polypeptide or fragment thereof. 

The expression vectors can contain one or more selectable marker genes to 
provide a phenotypic trait for selection of transformed host cells such as dihydrofolate
reductase or neomycin resistance for eukaryotic cell culture, or such as tetracycline or
ampicillin resistance in E. coli. 

Host cells containing the polynucleotides of interest, e.g., nucleic acids of the 
invention, can be cultured in conventional nutrient media modified as appropriate for 
activating promoters, selecting transformants or amplifying genes. The culture conditions,
such as temperature, pH and the like, are those previously used with the host cell selected for 
expression and will be apparent to the ordinarily skilled artisan. The clones which are 
identified as having the specified enzyme activity may then be sequenced to identify the 
polynucleotide sequence encoding an enzyme having the enhanced activity. 

The invention provides a method for overexpressing a recombinant xylanase
in a cell comprising expressing a vector comprising a nucleic acid of the invention, e.g., a
nucleic acid comprising a nucleic acid sequence with at least about 50%, 51%, 52%, 53%, 
54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 
70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%,
86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more
sequence identity to a sequence of Group A nucleic acid sequences over a region of at least 
about 100 residues, wherein the sequence identities are determined by analysis with a 
sequence comparison algorithm or by visual inspection, or, a nucleic acid that hybridizes 
under stringent conditions to a nucleic acid sequence as set forth in Group A nucleic acid
sequences, or a subsequence thereof. The overexpression can be effected by any means, e.g.,
use of a high activity promoter, a dicistronic vector or by gene amplification of the vector. 

The nucleic acids of the invention can be expressed, or overexpressed, in any 
in vitro or in vivo expression system. Any cell culture systems can be employed to express, 
or over-express, recombinant protein, including bacterial, insect, yeast, fungal or mammalian
cultures. Over-expression can be effected by appropriate choice of promoters, enhancers,
vectors (e.g., use of replicon vectors, dicistronic vectors (see, e.g., Gurtu (1996) Biochem.
Biophys. Res. Commun. 229:295-8)), media, culture systems and the like. In one aspect, gene
amplification using selection markers, e.g., glutamine synthetase (see, e.g., Sanders (1987)
Dev. Biol. Stand. 66:55-63), in cell systems is used to overexpress the polypeptides of the
invention.

Additional details regarding this approach are in the public literature and/or 
are known to the skilled artisan. In a particular non-limiting exemplification, such publicly 
available literature includes EP 0659215 (WO 9403612 A1) (Nevalainen et al.); Lapidot, A.,
Mechaly, A., Shoham, Y., "Overexpression and single-step purification of a thermostable
xylanase from Bacillus stearothermophilus T-6," J. Biotechnol. Nov 51:259-64 (1996); Lüthi,
E., Jasmat, N.B., Bergquist, P.L., "Xylanase from the extremely thermophilic bacterium
Caldocellum saccharolyticum: overexpression of the gene in Escherichia coli and
characterization of the gene product," Appl. Environ. Microbiol. Sep 56:2677-83 (1990); and
Sung, W.L., Luk, C.K., Zahab, D.M., Wakarchuk, W., "Overexpression of the Bacillus
subtilis and circulans xylanases in Escherichia coli," Protein Expr. Purif. Jun 4:200-6 (1993),
although these references do not teach the inventive enzymes of the instant application.

The host cell may be any of the host cells familiar to those skilled in the art, 
including prokaryotic cells, eukaryotic cells, mammalian cells, insect cells, or plant cells. As
representative examples of appropriate hosts, there may be mentioned: bacterial cells, such as
E. coli, Streptomyces, Bacillus subtilis, Salmonella typhimurium and various species within
the genera Pseudomonas, Streptomyces and Staphylococcus, fungal cells, such as yeast,
insect cells such as Drosophila S2 and Spodoptera Sf9, animal cells such as CHO, COS or
Bowes melanoma and adenoviruses. The selection of an appropriate host is within the
abilities of those skilled in the art.

The vector may be introduced into the host cells using any of a variety of 
techniques, including transformation, transfection, transduction, viral infection, gene guns, or 
Ti-mediated gene transfer. Particular methods include calcium phosphate transfection, 
DEAE-Dextran mediated transfection, lipofection, or electroporation (Davis, L., Dibner, M.,
Battey, I., Basic Methods in Molecular Biology, (1986)).

Where appropriate, the engineered host cells can be cultured in conventional 
nutrient media modified as appropriate for activating promoters, selecting transformants or 
amplifying the genes of the invention. Following transformation of a suitable host strain and 
growth of the host strain to an appropriate cell density, the selected promoter may be induced
by appropriate means (e.g., temperature shift or chemical induction) and the cells may be
cultured for an additional period to allow them to produce the desired polypeptide or 
fragment thereof. 

Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means and the resulting crude extract is retained for further purification. Microbial
cells employed for expression of proteins can be disrupted by any convenient method,
including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents.
Such methods are well known to those skilled in the art. The expressed polypeptide or 
fragment thereof can be recovered and purified from recombinant cell cultures by methods 
including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation
exchange chromatography, phosphocellulose chromatography, hydrophobic interaction
chromatography, affinity chromatography, hydroxylapatite chromatography and lectin 
chromatography. Protein refolding steps can be used, as necessary, in completing 
configuration of the polypeptide. If desired, high performance liquid chromatography 
(HPLC) can be employed for final purification steps.

Various mammalian cell culture systems can also be employed to express 
recombinant protein. Examples of mammalian expression systems include the COS-7 lines 
of monkey kidney fibroblasts (described by Gluzman, Cell, 23:175, 1981) and other cell lines 
capable of expressing proteins from a compatible vector, such as the C127, 3T3, CHO, HeLa
and BHK cell lines.

The constructs in host cells can be used in a conventional manner to produce 
the gene product encoded by the recombinant sequence. Depending upon the host employed 
in a recombinant production procedure, the polypeptides produced by host cells containing 
the vector may be glycosylated or may be non-glycosylated. Polypeptides of the invention
may or may not also include an initial methionine amino acid residue.

Alternatively, the polypeptides of Group B amino acid sequences and 
sequences substantially identical thereto, or fragments comprising at least 5, 10, 15, 20, 25, 
30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof can be synthetically produced 
by conventional peptide synthesizers. In other aspects, fragments or portions of the
polypeptides may be employed for producing the corresponding full-length polypeptide by
peptide synthesis; therefore, the fragments may be employed as intermediates for producing 
the full-length polypeptides. 

Cell-free translation systems can also be employed to produce one of the 
polypeptides of Group B amino acid sequences and sequences substantially identical thereto,
or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive
amino acids thereof using mRNAs transcribed from a DNA construct comprising a promoter 
operably linked to a nucleic acid encoding the polypeptide or fragment thereof. In some 
aspects, the DNA construct may be linearized prior to conducting an in vitro transcription 
reaction. The transcribed mRNA is then incubated with an appropriate cell-free translation
extract, such as a rabbit reticulocyte extract, to produce the desired polypeptide or fragment
thereof.

Amplification of Nucleic Acids

In practicing the invention, nucleic acids of the invention and nucleic acids 
encoding the xylanases of the invention, or modified nucleic acids of the invention, can be 
reproduced by amplification. Amplification can also be used to clone or modify the nucleic 
acids of the invention. Thus, the invention provides amplification primer sequence pairs for
amplifying nucleic acids of the invention. One of skill in the art can design amplification 
primer sequence pairs for any part of or the full length of these sequences. 

In one aspect, the invention provides a nucleic acid amplified by a primer pair 
of the invention, e.g., a primer pair as set forth by about the first (the 5') 12, 13, 14, 15, 16, 
17, 18, 19, 20, 21, 22, 23, 24, or 25 residues of a nucleic acid of the invention, and about the
first (the 5') 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 residues of the complementary 
strand. 
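
As a non-limiting sketch of how such a primer pair might be derived, the following Python example takes the first (5') residues of a template as one primer and the first (5') residues of the complementary strand as the other. The function names and the 40-residue template are hypothetical and are used for illustration only; practical primer design would also weigh factors such as melting temperature and secondary structure, which this sketch ignores.

```python
def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def primer_pair(template: str, forward_len: int = 20, reverse_len: int = 20):
    """Derive a primer pair from the 5' ends of a template and its complement.

    forward primer: the first `forward_len` residues of the template
    reverse primer: the first `reverse_len` residues of the complementary strand
    """
    if not (12 <= forward_len <= 25 and 12 <= reverse_len <= 25):
        raise ValueError("this sketch expects primer lengths of 12 to 25 residues")
    if len(template) < max(forward_len, reverse_len):
        raise ValueError("template is shorter than the requested primer length")
    forward = template[:forward_len].upper()
    reverse = reverse_complement(template)[:reverse_len]
    return forward, reverse


# Hypothetical template, for illustration only.
template = "ATGGCTAGCTTGACCGGTAAACGTTCAGGATCCGATTAAC"
fwd, rev = primer_pair(template, 12, 12)
print(fwd, rev)
```

The reverse primer anneals to the 3' end of the template strand, so the two primers together bracket the full length of the template.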

The invention provides an amplification primer sequence pair for amplifying a 
nucleic acid encoding a polypeptide having a xylanase activity, wherein the primer pair is
capable of amplifying a nucleic acid comprising a sequence of the invention, or fragments or
subsequences thereof. One or each member of the amplification primer sequence pair can 
comprise an oligonucleotide comprising at least about 10 to 50 consecutive bases of the 
sequence, or about 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 consecutive bases 
of the sequence. The invention provides amplification primer pairs, wherein the primer pair
comprises a first member having a sequence as set forth by about the first (the 5') 12, 13, 14,
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 residues of a nucleic acid of the invention, and a 
second member having a sequence as set forth by about the first (the 5') 12, 13, 14, 15, 16, 
17, 18, 19, 20, 21, 22, 23, 24, or 25 residues of the complementary strand of the first member. 
The invention provides xylanases generated by amplification, e.g., polymerase chain reaction
(PCR), using an amplification primer pair of the invention. The invention provides methods
of making a xylanase by amplification, e.g., polymerase chain reaction (PCR), using an 
amplification primer pair of the invention. In one aspect, the amplification primer pair 
amplifies a nucleic acid from a library, e.g., a gene library, such as an environmental library. 

Amplification reactions can also be used to quantify the amount of nucleic
acid in a sample (such as the amount of message in a cell sample), label the nucleic acid (e.g.,
to apply it to an array or a blot), detect the nucleic acid, or quantify the amount of a specific
nucleic acid in a sample. In one aspect of the invention, message isolated from a cell or a
cDNA library is amplified.

The skilled artisan can select and design suitable oligonucleotide amplification
primers. Amplification methods are also well known in the art, and include, e.g., polymerase 
chain reaction, PCR (see, e.g., PCR PROTOCOLS, A GUIDE TO METHODS AND
APPLICATIONS, ed. Innis, Academic Press, N.Y. (1990) and PCR STRATEGIES (1995),
ed. Innis, Academic Press, Inc., N.Y.), ligase chain reaction (LCR) (see, e.g., Wu (1989)
Genomics 4:560; Landegren (1988) Science 241:1077; Barringer (1990) Gene 89:117);
transcription amplification (see, e.g., Kwoh (1989) Proc. Natl. Acad. Sci. USA 86:1173);
and, self-sustained sequence replication (see, e.g., Guatelli (1990) Proc. Natl. Acad. Sci. USA
87:1874); Q Beta replicase amplification (see, e.g., Smith (1997) J. Clin. Microbiol. 35:1477-1491),
automated Q-beta replicase amplification assay (see, e.g., Burg (1996) Mol. Cell.
Probes 10:257-271) and other RNA polymerase mediated techniques (e.g., NASBA, 
Cangene, Mississauga, Ontario); see also Berger (1987) Methods Enzymol. 152:307-316; 
Sambrook; Ausubel; U.S. Patent Nos. 4,683,195 and 4,683,202; Sooknanan (1995) 
Biotechnology 13:563-564. 

Determining the degree of sequence identity

The invention provides nucleic acids comprising sequences having at least 
about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 
65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 
81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, or more, or complete (100%) sequence identity to an exemplary nucleic acid
of the invention (e.g., SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID
NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID 
NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41,
SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID
NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, 
SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71, SEQ ID NO:73, SEQ ID 
NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, 
SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95, SEQ ID
NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID
NO:107, SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID
NO:117, SEQ ID NO:119, SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID
NO:127, SEQ ID NO:129, SEQ ID NO:131, SEQ ID NO:133, SEQ ID NO:135, SEQ ID
NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143, SEQ ID NO:145, SEQ ID
NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155, SEQ ID
NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID
NO:167, SEQ ID NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID
NO:177, SEQ ID NO:179, SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID
NO:187, SEQ ID NO:189, SEQ ID NO:191, SEQ ID NO:193, SEQ ID NO:195, SEQ ID 
NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID NO:203, SEQ ID NO:205, SEQ ID 
NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215, SEQ ID
NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID
NO:227, SEQ ID NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID
NO:237, SEQ ID NO:239, SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID 
NO:247, SEQ ID NO:249, SEQ ID NO:251, SEQ ID NO:253, SEQ ID NO:255, SEQ ID 
NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID NO:263, SEQ ID NO:265, SEQ ID 
NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275, SEQ ID
NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID
NO:287, SEQ ID NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID
NO:297, SEQ ID NO:299, SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID 
NO:307, SEQ ID NO:309, SEQ ID NO:311, SEQ ID NO:313, SEQ ID NO:315, SEQ ID 
NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID NO:323, SEQ ID NO:325, SEQ ID
NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335, SEQ ID
NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID
NO:347, SEQ ID NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID
NO:357, SEQ ID NO:359, SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID
NO:367, SEQ ID NO:369, SEQ ID NO:371, SEQ ID NO:373, SEQ ID NO:375, SEQ ID
NO:377 or SEQ ID NO:379) over a region of at least about 50, 75, 100, 150, 200, 250, 300,
350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 
1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550 or more, residues. The invention provides 
polypeptides comprising sequences having at least about 50%, 51%, 52%, 53%, 54%, 55%, 
56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%,
72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%,
88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete 
(100%) sequence identity to an exemplary polypeptide of the invention. The extent of 
sequence identity (homology) may be determined using any computer program and
associated parameters, including those described herein, such as BLAST 2.2.2 or FASTA
version 3.0t78, with the default parameters. 

The nucleic acid sequences are also referred to as "Group A" nucleic acid 
sequences, which include sequences substantially identical thereto, as well as sequences
homologous to Group A nucleic acid sequences and fragments thereof and sequences
complementary to all of the preceding sequences. Nucleic acid sequences of the invention
can comprise at least 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, or 500 
consecutive nucleotides of an exemplary sequence of the invention (e.g., Group A nucleic 
acid sequences) and sequences substantially identical thereto. Homologous sequences and
fragments of Group A nucleic acid sequences and sequences substantially identical thereto,
refer to a sequence having at least 99%, 98%, 97%, 96%, 95%, 90%, 85%, 80%, 75%, 70%, 
65%, 60%, 55%, or 50% homology to these sequences. Homology may be determined using 
any of the computer programs and parameters described herein, including FASTA version 
3.0t78 with the default parameters. Homologous sequences also include RNA sequences in
which uridines replace the thymines in the nucleic acid sequences as set forth in the Group A
nucleic acid sequences. The homologous sequences may be obtained using any of the 
procedures described herein or may result from the correction of a sequencing error. It will be 
appreciated that the nucleic acid sequences as set forth in Group A nucleic acid sequences 
and sequences substantially identical thereto, can be represented in the traditional single
character format (See the inside back cover of Stryer, Lubert. Biochemistry, 3rd Ed., W. H.
Freeman & Co., New York.) or in any other format which records the identity of the
nucleotides in a sequence. 
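
Two of the relationships described above lend themselves to a brief, non-limiting Python sketch: the RNA counterpart of a DNA sequence, in which uridines replace thymines, and a test for whether a candidate sequence comprises at least a given number of consecutive nucleotides of a reference sequence. The function names and the sequences are hypothetical, and the sketch assumes sequences in the single character format using only A, C, G and T.

```python
def to_rna(dna: str) -> str:
    """Return the RNA form of a DNA sequence (uridine in place of thymine)."""
    return dna.upper().replace("T", "U")


def shares_fragment(candidate: str, reference: str, min_len: int = 10) -> bool:
    """True if `candidate` contains at least `min_len` consecutive
    nucleotides taken from `reference`."""
    candidate, reference = candidate.upper(), reference.upper()
    windows = range(len(reference) - min_len + 1)  # empty if reference is too short
    return any(reference[i:i + min_len] in candidate for i in windows)


# Hypothetical sequences, for illustration only.
reference = "ATGGCTAGCTTGACCGGTAA"
candidate = "CCCC" + "GCTAGCTTGACC" + "GGGG"
print(to_rna(reference))
print(shares_fragment(candidate, reference, 10))
```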

Various sequence comparison programs identified elsewhere in this patent 
specification are particularly contemplated for use in this aspect of the invention. Protein and/or
nucleic acid sequence homologies may be evaluated using any of the variety of sequence
comparison algorithms and programs known in the art. Such algorithms and programs 
include, but are by no means limited to, TBLASTN, BLASTP, FASTA, TFASTA and 
CLUSTALW (Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85(8):2444-2448, 1988;
Altschul et al., J. Mol. Biol. 215(3):403-410, 1990; Thompson et al., Nucleic Acids Res.
22(2):4673-4680, 1994; Higgins et al., Methods Enzymol. 266:383-402, 1996; Altschul et al.,
J. Mol. Biol. 215(3):403-410, 1990; Altschul et al., Nature Genetics 3:266-272, 1993).

Homology or identity is often measured using sequence analysis software (e.g., 
Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin 
Biotechnology Center, 1710 University Avenue, Madison, WI 53705). Such software matches
similar sequences by assigning degrees of homology to various deletions, substitutions and other
modifications. The terms "homology" and "identity" in the context of two or more nucleic acids
or polypeptide sequences, refer to two or more sequences or subsequences that are the same or 
have a specified percentage of amino acid residues or nucleotides that are the same when 
compared and aligned for maximum correspondence over a comparison window or designated
region as measured using any number of sequence comparison algorithms or by manual 
alignment and visual inspection. 

For sequence comparison, typically one sequence acts as a reference sequence, 
to which test sequences are compared. When using a sequence comparison algorithm, test and
reference sequences are entered into a computer, subsequence coordinates are designated, if
necessary, and sequence algorithm program parameters are designated. Default program
parameters can be used, or alternative parameters can be designated. The sequence comparison 
algorithm then calculates the percent sequence identities for the test sequences relative to the 
reference sequence, based on the program parameters. 
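
The final calculation step can be illustrated, in a non-limiting way, by the Python sketch below. It assumes the test and reference sequences have already been aligned by one of the programs named herein and are supplied as equal-length strings with "-" marking gaps; the function name is hypothetical. Percent identity is taken here as identical positions divided by the number of alignment columns in the designated region, which is only one of several conventions, and individual programs may use a different denominator.

```python
def percent_identity(aligned_ref: str, aligned_test: str,
                     start: int = 0, end=None) -> float:
    """Percent identity over alignment columns [start, end).

    Both inputs are rows of one pairwise alignment (same length, '-' for gaps).
    A column counts as a match only when both rows carry the same residue.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have the same length")
    end = len(aligned_ref) if end is None else end
    columns = list(zip(aligned_ref[start:end].upper(),
                       aligned_test[start:end].upper()))
    if not columns:
        raise ValueError("the designated region is empty")
    matches = sum(1 for ref, test in columns if ref == test and ref != "-")
    return 100.0 * matches / len(columns)


# Hypothetical pre-aligned sequences, for illustration only.
ref_row = "ATGCTAGC-A"
test_row = "ATGATAGCTA"
identity = percent_identity(ref_row, test_row)
print(identity, identity >= 50.0)
```

Passing `start` and `end` restricts the calculation to designated subsequence coordinates, corresponding to a comparison over a designated region rather than the full alignment.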

A "comparison window", as used herein, includes reference to a segment of any
one of the number of contiguous positions selected from the group consisting of from 20 to 600,
usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be
compared to a reference sequence of the same number of contiguous positions after the two 
sequences are optimally aligned. Methods of alignment of sequences for comparison are well
known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the
local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482, 1981, by the
homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443, 1970, by the
search for similarity method of Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85:2444, 1988,
by computerized implementations of these algorithms (GAP, BESTFIT, FASTA and TFASTA
in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr.,
Madison, WI), or by manual alignment and visual inspection. Other algorithms for determining
homology or identity include, for example, in addition to a BLAST program (Basic Local
Alignment Search Tool at the National Center for Biotechnology Information), ALIGN, AMAS
(Analysis of Multiply Aligned Sequences), AMPS (Protein Multiple Sequence Alignment),
ASSET (Aligned Segment Statistical Evaluation Tool), BANDS, BESTSCOR, BIOSCAN
(Biological Sequence Comparative Analysis Node), BLIMPS (BLocks IMProved Searcher),
FASTA, Intervals & Points, BMB, CLUSTAL V, CLUSTAL W, CONSENSUS,
LCONSENSUS, WCONSENSUS, Smith-Waterman algorithm, DARWIN, Las Vegas
algorithm, FNAT (Forced Nucleotide Alignment Tool), Framealign, Framesearch,
DYNAMIC, FILTER, FSAP (Fristensky Sequence Analysis Package), GAP (Global
Alignment Program), GENAL, GIBBS, GenQuest, ISSC (Sensitive Sequence Comparison),
LALIGN (Local Sequence Alignment), LCP (Local Content Program), MACAW (Multiple
Alignment Construction & Analysis Workbench), MAP (Multiple Alignment Program),
MBLKP, MBLKN, PIMA (Pattern-Induced Multi-sequence Alignment), SAGA (Sequence
Alignment by Genetic Algorithm) and WHAT-IF. Such alignment programs can also be used
to screen genome databases to identify polynucleotide sequences having substantially 
identical sequences. A number of genome databases are available, for example, a substantial 
portion of the human genome is available as part of the Human Genome Sequencing Project (J.
Roach, http://weber.u.Washington.edu/~roach/human_genome_progress 2.html) (Gibbs, 1995).
At least twenty-one other genomes have already been sequenced, including, for example, M.
genitalium (Fraser et al., 1995), M. jannaschii (Bult et al., 1996), H. influenzae (Fleischmann et
al., 1995), E. coli (Blattner et al., 1997), yeast (S. cerevisiae) (Mewes et al., 1997) and D.
melanogaster (Adams et al., 2000). Significant progress has also been made in sequencing the
genomes of model organisms, such as mouse, C. elegans and Arabidopsis sp. Several databases
containing genomic information annotated with some functional information are maintained by
different organizations and are accessible via the internet.
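
As an illustration of the local homology approach of Smith & Waterman mentioned above, the following Python sketch computes the best local alignment score between two sequences by dynamic programming. The scoring values (match, mismatch and a linear gap penalty) are arbitrary assumptions chosen for the example and are not the defaults of any program named herein; only the score is returned, not the alignment itself.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Replacing the floor of zero with gap-initialized borders and reading the final cell instead of the maximum would give a global alignment score in the manner of Needleman & Wunsch.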

One example of a useful algorithm is the BLAST and BLAST 2.0 algorithms,
which are described in Altschul et al., Nuc. Acids Res. 25:3389-3402, 1997 and Altschul et
al., J. Mol. Biol. 215:403-410, 1990, respectively. Software for performing BLAST analyses
is publicly available through the National Center for Biotechnology Information. This 
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short 
words of length W in the query sequence, which either match or satisfy some positive-valued 
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing 
them. The word hits are extended in both directions along each sequence for as far as the 
cumulative alignment score can be increased. Cumulative scores are calculated using, for
nucleotide sequences, the parameters M (reward score for a pair of matching residues; always
>0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a
scoring matrix is used to calculate the cumulative score.
Extension of the word hits in each direction is halted when: the cumulative alignment score
falls off by the quantity X from its maximum achieved value; the cumulative score goes to 
zero or below, due to the accumulation of one or more negative-scoring residue alignments; 
or the end of either sequence is reached. The BLAST algorithm parameters W, T and X 
determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide
sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and
a comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10 and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915, 1992), alignments (B) of 50,
expectation (E) of 10, M=5, N=-4 and a comparison of both strands.
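
The seeding and extension steps described above can be sketched in Python as follows. This is a simplified illustration and not the BLAST implementation: seeds are restricted to exact word matches (the neighborhood threshold T is omitted), extension is ungapped, and the function names and the example value of X are assumptions. The reward M=5 and penalty N=-4 follow the nucleotide defaults stated above.

```python
def find_word_hits(query: str, subject: str, w: int):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    index = {}
    for i in range(len(query) - w + 1):
        index.setdefault(query[i:i + w], []).append(i)
    hits = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            hits.append((i, j))
    return hits


def extend(query: str, subject: str, qi: int, sj: int, step: int,
           m: int = 5, n: int = -4, x: int = 20):
    """Ungapped extension from (qi, sj); step=+1 extends right, step=-1 left.

    Returns (best_score, length_at_best_score).
    """
    score = best = best_len = 0
    k = 0
    while 0 <= qi + step * k < len(query) and 0 <= sj + step * k < len(subject):
        same = query[qi + step * k] == subject[sj + step * k]
        score += m if same else n
        k += 1
        if score > best:
            best, best_len = score, k
        # Halt when the score falls X below its maximum or drops to zero or below.
        if best - score >= x or score <= 0:
            break
    return best, best_len
```

For example, `extend("ACGTACGTTT", "ACGTACGAAA", 0, 0, 1, x=8)` extends through the seven matching positions and stops after two mismatches, once the score has fallen by 8 from its maximum.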

The BLAST algorithm also performs a statistical analysis of the similarity 
between two sequences (see, e.g., Karlin & Altschul, Proc. Natl. Acad. Sci. USA 90:5873, 
1993). One measure of similarity provided by the BLAST algorithm is the smallest sum
probability (P(N)), which provides an indication of the probability by which a match between
two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid
is considered similar to a reference sequence if the smallest sum probability in a comparison
of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably 
less than about 0.01 and most preferably less than about 0.001.

In one aspect, protein and nucleic acid sequence homologies are evaluated
using the Basic Local Alignment Search Tool ("BLAST"). In particular, five specific BLAST
programs are used to perform the following tasks:

(1) BLASTP and BLAST3 compare an amino acid query sequence against 
a protein sequence database;

(2) BLASTN compares a nucleotide query sequence against a nucleotide
sequence database;

(3) BLASTX compares the six-frame conceptual translation products of a 
query nucleotide sequence (both strands) against a protein sequence database; 

(4) TBLASTN compares a query protein sequence against a nucleotide 
sequence database translated in all six reading frames (both strands); and

(5) TBLASTX compares the six-frame translations of a nucleotide query 
sequence against the six-frame translations of a nucleotide sequence database. 
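
The six-frame conceptual translation used by BLASTX, TBLASTN and TBLASTX can be illustrated with the following Python sketch. It assumes the standard genetic code and an unambiguous DNA or RNA input; the function names and the frame labels ("+1" to "-3") are illustrative assumptions, and codons containing unrecognized characters are rendered as "X".

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; a trailing partial codon is dropped."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame(seq: str) -> dict:
    """Conceptual translations of both strands in all three offsets."""
    seq = seq.upper().replace("U", "T")
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames
```

Each of the six resulting protein strings can then be compared against a protein database, which is the comparison that item (3) describes.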

The BLAST programs identify homologous sequences by identifying similar 
segments, which are referred to herein as "high-scoring segment pairs," between a query 
amino or nucleic acid sequence and a test sequence which is preferably obtained from a
protein or nucleic acid sequence database. High-scoring segment pairs are preferably 
identified (i.e., aligned) by means of a scoring matrix, many of which are known in the art. 
Preferably, the scoring matrix used is the BLOSUM62 matrix (Gonnet et al, Science 
256:1443-1445, 1992; Henikoff and Henikoff, Proteins 17:49-61, 1993). Less preferably, the 
PAM or PAM250 matrices may also be used (see, e.g., Schwartz and Dayhoff, eds., 1978,
Matrices for Detecting Distance Relationships: Atlas of Protein Sequence and Structure, 
Washington: National Biomedical Research Foundation). BLAST programs are accessible 
through the U.S. National Library of Medicine.

The parameters used with the above algorithms may be adapted depending on
the sequence length and degree of homology studied. In some aspects, the parameters may be
the default parameters used by the algorithms in the absence of instructions from the user. 

Computer systems and computer program products 

To determine and identify sequence identities, structural homologies, motifs
and the like in silico, a nucleic acid or polypeptide sequence of the invention can be stored,
recorded, and manipulated on any medium which can be read and accessed by a computer. 

Accordingly, the invention provides computers, computer systems, computer
readable media, computer program products and the like having recorded or stored thereon the
nucleic acid and polypeptide sequences of the invention. As used herein, the words "recorded"
and "stored" refer to a process for storing information on a computer medium. A skilled artisan
can readily adopt any known methods for recording information on a computer readable 
medium to generate manufactures comprising one or more of the nucleic acid and/or 
polypeptide sequences of the invention. 

The polypeptides of the invention include the polypeptide sequences of Group
B amino acid sequences, the exemplary sequences of the invention, and sequences
substantially identical thereto, and fragments of any of the preceding sequences. 
Substantially identical, or homologous, polypeptide sequences refer to a polypeptide 
sequence having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 
61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%,
77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence identity to an
exemplary sequence of the invention, e.g., a polypeptide sequence of the Group B amino
acid sequences. 

Homology may be determined using any of the computer programs and 
parameters described herein, including FASTA version 3.0t78 with the default parameters or
with any modified parameters. The homologous sequences may be obtained using any of the 
procedures described herein or may result from the correction of a sequencing error. The 
polypeptide fragments comprise at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 150, 
200, 250, 300, 350, 400, 450, 500 or more consecutive amino acids of the polypeptides of Group
B amino acid sequences and sequences substantially identical thereto. It will be appreciated 
that the polypeptide codes as set forth in Group B amino acid sequences and sequences 
substantially identical thereto, can be represented in the traditional single character format or 
three letter format (see the inside back cover of Stryer, Lubert. Biochemistry, 3rd Ed., W. H.
Freeman & Co., New York) or in any other format which relates the identity of the polypeptides
in a sequence. 

A nucleic acid or polypeptide sequence of the invention can be stored, recorded 
and manipulated on any medium which can be read and accessed by a computer. As used
herein, the words "recorded" and "stored" refer to a process for storing information on a
computer medium. A skilled artisan can readily adopt any of the presently known methods for
recording information on a computer readable medium to generate manufactures comprising one 
or more of the nucleic acid sequences as set forth in Group A nucleic acid sequences and 
sequences substantially identical thereto, or one or more of the polypeptide sequences as set forth
in Group B amino acid sequences and sequences substantially identical thereto. Another
aspect of the invention is a computer readable medium having recorded thereon at least 2, 5, 10,
15, or 20 or more nucleic acid sequences as set forth in Group A nucleic acid sequences and
sequences substantially identical thereto. 

Another aspect of the invention is a computer readable medium having
recorded thereon one or more of the nucleic acid sequences as set forth in Group A nucleic
acid sequences and sequences substantially identical thereto. Another aspect of the invention 
is a computer readable medium having recorded thereon one or more of the polypeptide 
sequences as set forth in Group B amino acid sequences and sequences substantially identical 
thereto. Another aspect of the invention is a computer readable medium having recorded
thereon at least 2, 5, 10, 15, or 20 or more of the sequences as set forth above.

Computer readable media include magnetically readable media, optically 
readable media, electronically readable media and magnetic/optical media. For example, the 
computer readable media may be a hard disk, a floppy disk, a magnetic tape, CD-ROM, Digital 
Versatile Disk (DVD), Random Access Memory (RAM), or Read Only Memory (ROM) as well
as other types of media known to those skilled in the art.

Aspects of the invention include systems (e.g., internet based systems), 
particularly computer systems which store and manipulate the sequence information described 
herein. One example of a computer system 100 is illustrated in block diagram form in Figure 1.
As used herein, "a computer system" refers to the hardware components, software components
and data storage components used to analyze a nucleotide sequence of a nucleic acid sequence
as set forth in Group A nucleic acid sequences and sequences substantially identical thereto,
or a polypeptide sequence as set forth in the Group B amino acid sequences. The computer
system 100 typically includes a processor for processing, accessing and manipulating the
sequence data. The processor 105 can be any well-known type of central processing unit, such
as, for example, the Pentium III from Intel Corporation, or similar processor from Sun, 
Motorola, Compaq, AMD or International Business Machines. 

Typically the computer system 100 is a general purpose system that comprises 
the processor 105 and one or more internal data storage components 110 for storing data and one
or more data retrieving devices for retrieving the data stored on the data storage components. A
skilled artisan can readily appreciate that any one of the currently available computer systems
is suitable.

In one particular aspect, the computer system 100 includes a processor 105 
connected to a bus which is connected to a main memory 115 (preferably implemented as RAM) 
and one or more internal data storage devices 110, such as a hard drive and/or other computer
readable media having data recorded thereon. In some aspects, the computer system 100 further
includes one or more data retrieving devices 118 for reading the data stored on the internal data
storage devices 110. 

The data retrieving device 118 may represent, for example, a floppy disk drive, a 
compact disk drive, a magnetic tape drive, or a modem capable of connection to a remote data
storage system (e.g., via the internet), etc. In some aspects, the internal data storage device 110 is
a removable computer readable medium such as a floppy disk, a compact disk, a magnetic tape,
etc. containing control logic and/or data recorded thereon. The computer system 100 may
advantageously include or be programmed by appropriate software for reading the control logic
and/or the data from the data storage component once inserted in the data retrieving device.

The computer system 100 includes a display 120 which is used to display output 
to a computer user. It should also be noted that the computer system 100 can be linked to other 
computer systems 125a-c in a network or wide area network to provide centralized access to the 
computer system 100. 

Software for accessing and processing the nucleotide sequences of a nucleic acid
sequence as set forth in Group A nucleic acid sequences and sequences substantially identical
thereto, or a polypeptide sequence as set forth in Group B amino acid sequences and sequences 
substantially identical thereto, (such as search tools, compare tools and modeling tools etc.) 
may reside in main memory 115 during execution.

In some aspects, the computer system 100 may further comprise a sequence
comparison algorithm for comparing a nucleic acid sequence as set forth in Group A nucleic 
acid sequences and sequences substantially identical thereto, or a polypeptide sequence as set 
forth in Group B amino acid sequences and sequences substantially identical thereto, stored on 
a computer readable medium to a reference nucleotide or polypeptide sequence(s) stored on a
computer readable medium. A "sequence comparison algorithm" refers to one or more 
programs which are implemented (locally or remotely) on the computer system 100 to compare 
a nucleotide sequence with other nucleotide sequences and/or compounds stored within a data 
storage means. For example, the sequence comparison algorithm may compare the nucleotide
sequences of a nucleic acid sequence as set forth in Group A nucleic acid sequences and
sequences substantially identical thereto, or a polypeptide sequence as set forth in Group B 
amino acid sequences and sequences substantially identical thereto, stored on a computer 
readable medium to reference sequences stored on a computer readable medium to identify 
homologies or structural motifs. 

Figure 2 is a flow diagram illustrating one aspect of a process 200 for comparing
a new nucleotide or protein sequence with a database of sequences in order to determine the
homology levels between the new sequence and the sequences in the database. The database of 
sequences can be a private database stored within the computer system 100, or a public database 
such as GENBANK that is available through the Internet. 

The process 200 begins at a start state 201 and then moves to a state 202 wherein
the new sequence to be compared is stored to a memory in a computer system 100. As
discussed above, the memory could be any type of memory, including RAM or an internal 
storage device. 

The process 200 then moves to a state 204 wherein a database of sequences is 
opened for analysis and comparison. The process 200 then moves to a state 206 wherein the
first sequence stored in the database is read into a memory on the computer. A comparison is
then performed at a state 210 to determine if the first sequence is the same as the new
sequence. It is important to note that this step is not limited to performing an exact comparison
between the new sequence and the first sequence in the database. Methods are well
known to those of skill in the art for comparing two nucleotide or protein sequences, even if
they are not identical. For example, gaps can be introduced into one sequence in order to raise 
the homology level between the two tested sequences. The parameters that control whether gaps 
or other features are introduced into a sequence during comparison are normally entered by the 
user of the computer system.

Once a comparison of the two sequences has been performed at the state 210, a
determination is made at a decision state 212 whether the two sequences are the same. Of
course, the term "same" is not limited to sequences that are absolutely identical. Sequences that
are within the homology parameters entered by the user will be marked as "same" in the process
200.

If a determination is made that the two sequences are the same, the process 200 
moves to a state 214 wherein the name of the sequence from the database is displayed to the 
user. This state notifies the user that the sequence with the displayed name fulfills the homology 
constraints that were entered. Once the name of the stored sequence is displayed to the user, the
process 200 moves to a decision state 218 wherein a determination is made whether more
sequences exist in the database. If no more sequences exist in the database, then the process 200
terminates at an end state 220. However, if more sequences do exist in the database, then the
process 200 moves to a state 224 wherein a pointer is moved to the next sequence in the
database so that it can be compared to the new sequence. In this manner, the new sequence is
aligned and compared with every sequence in the database.

It should be noted that if a determination had been made at the decision state 212 
that the sequences were not homologous, then the process 200 would move immediately to the 
decision state 218 in order to determine if any other sequences were available in the database for
comparison. 
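
The loop of process 200 can be sketched in Python as shown below. This is a minimal illustration under stated assumptions: the database is modeled as a list of (name, sequence) pairs, the comparison at state 210 is replaced by a simple ungapped position-by-position comparison rather than a true alignment, and the names `fraction_identical`, `search_database` and `min_identity` are hypothetical.

```python
def fraction_identical(a: str, b: str) -> float:
    """Ungapped stand-in for the comparison at state 210."""
    if not a or not b:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))


def search_database(new_sequence: str, database, min_identity: float = 0.9):
    """Return names of stored sequences that meet the homology constraint."""
    hits = []
    for name, stored in database:            # states 206 and 224: read next sequence
        level = fraction_identical(new_sequence, stored)   # state 210: compare
        if level >= min_identity:            # decision state 212: "same"?
            hits.append(name)                # state 214: report the name
    return hits                              # end state 220: database exhausted
```

In practice the comparison step would be one of the alignment programs enumerated herein, with gap and scoring parameters supplied by the user.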

Accordingly, one aspect of the invention is a computer system comprising a
processor, a data storage device having stored thereon a nucleic acid sequence as set forth in
Group A nucleic acid sequences and sequences substantially identical thereto, or a 
polypeptide sequence as set forth in Group B amino acid sequences and sequences substantially 
identical thereto, a data storage device having retrievably stored thereon reference nucleotide
sequences or polypeptide sequences to be compared to a nucleic acid sequence as set forth in
Group A nucleic acid sequences and sequences substantially identical thereto, or a 
polypeptide sequence as set forth in Group B amino acid sequences and sequences substantially 
identical thereto and a sequence comparer for conducting the comparison. The sequence 
comparer may indicate a homology level between the sequences compared or identify
structural motifs in the above described nucleic acid code of Group A nucleic acid sequences
and sequences substantially identical thereto, or a polypeptide sequence as set forth in Group B 
amino acid sequences and sequences substantially identical thereto, or it may identify 
structural motifs in sequences which are compared to these nucleic acid codes and 
polypeptide codes. In some aspects, the data storage device may have stored thereon the 
sequences of at least 2, 5, 10, 15, 20, 25, 30 or 40 or more of the nucleic acid sequences as set
forth in Group A nucleic acid sequences and sequences substantially identical thereto, or the 
polypeptide sequences as set forth in Group B amino acid sequences and sequences 
substantially identical thereto.

Another aspect of the invention is a method for determining the level of
homology between a nucleic acid sequence as set forth in Group A nucleic acid sequences and
sequences substantially identical thereto, or a polypeptide sequence as set forth in Group B 
amino acid sequences and sequences substantially identical thereto and a reference nucleotide 
sequence. The method includes reading the nucleic acid code or the polypeptide code and the
reference nucleotide or polypeptide sequence through the use of a computer program which
determines homology levels and determining homology between the nucleic acid code or
polypeptide code and the reference nucleotide or polypeptide sequence with the computer
program. The computer program may be any of a number of computer programs for
determining homology levels, including those specifically enumerated herein (e.g., BLAST2N
with the default parameters or with any modified parameters). The method may be implemented
using the computer systems described above. The method may also be performed by reading at
least 2, 5, 10, 15, 20, 25, 30 or 40 or more of the above described nucleic acid sequences as set
forth in the Group A nucleic acid sequences, or the polypeptide sequences as set forth in the 
Group B amino acid sequences through use of the computer program and determining
homology between the nucleic acid codes or polypeptide codes and reference nucleotide
sequences or polypeptide sequences. 

Figure 3 is a flow diagram illustrating one aspect of a process 250 in a 
computer for determining whether two sequences are homologous. The process 250 begins at
a start state 252 and then moves to a state 254 wherein a first sequence to be compared is
stored to a memory. The second sequence to be compared is then stored to a memory at a
state 256. The process 250 then moves to a state 260 wherein the first character in the first
sequence is read and then to a state 262 wherein the first character of the second sequence is
read. It should be understood that if the sequence is a nucleotide sequence, then the character
would normally be either A, T, C, G or U. If the sequence is a protein sequence, then it is
preferably in the single letter amino acid code so that the first and second sequences can be
easily compared. 

A determination is then made at a decision state 264 whether the two 
characters are the same. If they are the same, then the process 250 moves to a state 268 
wherein the next characters in the first and second sequences are read. A determination is 
then made whether the next characters are the same. If they are, then the process 250
continues this loop until two characters are not the same. If a determination is made that the
next two characters are not the same, the process 250 moves to a decision state 274 to
determine whether there are any more characters in either sequence to read.

If there are not any more characters to read, then the process 250 moves to a
state 276 wherein the level of homology between the first and second sequences is displayed
to the user. The level of homology is determined by calculating the proportion of characters
between the sequences that were the same out of the total number of characters in the first
sequence. Thus, if every character in a first 100 nucleotide sequence aligned with every
character in a second sequence, the homology level would be 100%.
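
The homology level computed at state 276 can be illustrated with a short Python sketch. It assumes a simple position-by-position reading of the two sequences without gaps, and it treats positions of the first sequence that have no counterpart in the second as mismatches; the function name is hypothetical.

```python
def homology_level(first: str, second: str) -> float:
    """Percent of characters in `first` matched at the same position in `second`."""
    if not first:
        raise ValueError("first sequence must not be empty")
    same = 0
    for i, ch in enumerate(first):           # read successive characters
        if i < len(second) and second[i] == ch:   # decision state 264
            same += 1
    # State 276: proportion of matching characters out of the length of the first sequence.
    return 100.0 * same / len(first)
```

For instance, `homology_level("ACGT", "ACGA")` yields 75.0, since three of the four characters of the first sequence match.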

Alternatively, the computer program may be a computer program which 
compares the nucleotide sequences of a nucleic acid sequence as set forth in the invention, to 
one or more reference nucleotide sequences in order to determine whether the nucleic acid code 
of Group A nucleic acid sequences and sequences substantially identical thereto, differs from
a reference nucleic acid sequence at one or more positions. Optionally such a program records
the length and identity of inserted, deleted or substituted nucleotides with respect to the sequence 
of either the reference polynucleotide or a nucleic acid sequence as set forth in Group A nucleic 
acid sequences and sequences substantially identical thereto. In one aspect, the computer 
program may be a program which determines whether a nucleic acid sequence as set forth in
Group A nucleic acid sequences and sequences substantially identical thereto, contains a
single nucleotide polymorphism (SNP) with respect to a reference nucleotide sequence. 
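
A program of this kind can be sketched in Python as follows. The sketch is limited to substituted nucleotides and assumes the test and reference sequences are already aligned and of equal length; recording insertions and deletions would require an alignment step first. The function name and the tuple layout are illustrative assumptions.

```python
def find_substitutions(test: str, reference: str):
    """List (1-based position, reference base, test base) for each mismatch."""
    if len(test) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (i + 1, r, t)
        for i, (t, r) in enumerate(zip(test.upper(), reference.upper()))
        if t != r
    ]
```

A single returned entry, such as `(3, "C", "G")`, corresponds to a single nucleotide difference with respect to the reference sequence at position 3.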

Accordingly, another aspect of the invention is a method for determining 
whether a nucleic acid sequence as set forth in Group A nucleic acid sequences and 
sequences substantially identical thereto, differs at one or more nucleotides from a reference
nucleotide sequence comprising the steps of reading the nucleic acid code and the reference
nucleotide sequence through use of a computer program which identifies differences between 
nucleic acid sequences and identifying differences between the nucleic acid code and the 
reference nucleotide sequence with the computer program. In some aspects, the computer 
program is a program which identifies single nucleotide polymorphisms. The method may be
implemented by the computer systems described above and the method illustrated in Figure
3. The method may also be performed by reading at least 2, 5, 10, 15, 20, 25, 30, or 40 or more 
of the nucleic acid sequences as set forth in Group A nucleic acid sequences and sequences 
substantially identical thereto and the reference nucleotide sequences through the use of the 
computer program and identifying differences between the nucleic acid codes and the
reference nucleotide sequences with the computer program. 

In other aspects the computer based system may further comprise an identifier 
for identifying features within a nucleic acid sequence as set forth in the Group A nucleic acid 
sequences or a polypeptide sequence as set forth in Group B amino acid sequences and
sequences substantially identical thereto. 

An "identifier" refers to one or more programs which identify certain
features within a nucleic acid sequence as set forth in Group A nucleic acid sequences and 
sequences substantially identical thereto, or a polypeptide sequence as set forth in Group B 
amino acid sequences and sequences substantially identical thereto. In one aspect, the
identifier may comprise a program which identifies an open reading frame in a nucleic acid
sequence as set forth in Group A nucleic acid sequences and sequences substantially identical 
thereto. 

Figure 4 is a flow diagram illustrating one aspect of an identifier process 300
for detecting the presence of a feature in a sequence. The process 300 begins at a start state
302 and then moves to a state 304 wherein a first sequence that is to be checked for features
is stored to a memory 115 in the computer system 100. The process 300 then moves to a
state 306 wherein a database of sequence features is opened. Such a database would include
a list of each feature's attributes along with the name of the feature. For example, a feature
name could be "Initiation Codon" and the attribute would be "ATG". Another example
would be the feature name "TAATAA Box" and the feature attribute would be "TAATAA".
An example of such a database is produced by the University of Wisconsin Genetics
Computer Group. Alternatively, the features may be structural polypeptide motifs such as
alpha helices, beta sheets, or functional polypeptide motifs such as enzymatic active sites,
helix-turn-helix motifs or other motifs known to those skilled in the art.

Once the database of features is opened at the state 306, the process 300 
moves to a state 308 wherein the first feature is read from the database. A comparison of the 
attribute of the first feature with the first sequence is then made at a state 310. A 
determination is then made at a decision state 316 whether the attribute of the feature was
found in the first sequence. If the attribute was found, then the process 300 moves to a state
318 wherein the name of the found feature is displayed to the user. 

The process 300 then moves to a decision state 320 wherein a determination is 
made whether more features exist in the database. If no more features exist, then the
process 300 terminates at an end state 324. However, if more features do exist in the
database, then the process 300 reads the next sequence feature at a state 326 and loops back
to the state 310 wherein the attribute of the next feature is compared against the first
sequence. It should be noted that if the feature attribute is not found in the first sequence at
the decision state 316, the process 300 moves directly to the decision state 320 in order to
determine if any more features exist in the database.
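
The identifier process 300 can be illustrated with the following Python sketch. The feature database is modeled as a list of (name, attribute) pairs containing the two examples given above; the list layout, the function name and the use of a plain substring search are assumptions made for illustration, and only the first occurrence of each attribute is reported.

```python
# Hypothetical in-memory feature database: (feature name, feature attribute).
FEATURES = [
    ("Initiation Codon", "ATG"),
    ("TAATAA Box", "TAATAA"),
]


def identify_features(sequence: str, features=FEATURES):
    """Return (feature name, 0-based position) for each attribute found."""
    seq = sequence.upper()
    found = []
    for name, attribute in features:     # states 308 and 326: read next feature
        pos = seq.find(attribute)        # state 310: compare attribute with sequence
        if pos != -1:                    # decision state 316: attribute found?
            found.append((name, pos))    # state 318: report the feature name
    return found                         # end state 324: no more features
```

Structural or functional polypeptide motifs would generally need pattern or profile matching rather than an exact substring search.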

Accordingly, another aspect of the invention is a method of identifying a 
feature within a nucleic acid sequence as set forth in Group A nucleic acid sequences and 
sequences substantially identical thereto, or a polypeptide sequence as set forth in Group B 
amino acid sequences and sequences substantially identical thereto, comprising reading the
nucleic acid code(s) or polypeptide code(s) through the use of a computer program which
identifies features therein and identifying features within the nucleic acid code(s) with the
computer program. In one aspect, the computer program comprises a computer program which
identifies open reading frames. The method may be performed by reading a single sequence
or at least 2, 5, 10, 15, 20, 25, 30, or 40 of the nucleic acid sequences as set forth in Group A
nucleic acid sequences and sequences substantially identical thereto, or the polypeptide
sequences as set forth in Group B amino acid sequences and sequences substantially identical
thereto, through the use of the computer program and identifying features within the nucleic 
acid codes or polypeptide codes with the computer program. 
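
As one example of a program that identifies open reading frames, the following sketch scans the three forward reading frames of a nucleic acid code. It is an illustration under stated assumptions: the function name `find_orfs`, the `min_codons` parameter, the restriction to the forward strand, and the use of the standard stop codons are choices made here, not requirements of the method.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def find_orfs(seq: str, min_codons: int = 2) -> List[Tuple[int, int, str]]:
    """Find open reading frames on the forward strand in all three frames.

    Returns (start, end, orf) tuples with a 0-based start and an exclusive end.
    Each ORF runs from an ATG through the first in-frame stop codon, and
    `min_codons` counts the stop codon as well.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None                   # close it and keep scanning
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAATAGGG"))
```

Reading several sequences, as the method contemplates, would amount to calling the function once per sequence.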

A nucleic acid sequence as set forth in Group A nucleic acid sequences and
sequences substantially identical thereto, or a polypeptide sequence as set forth in Group B
amino acid sequences and sequences substantially identical thereto, may be stored and
manipulated in a variety of data processor programs in a variety of formats. For example, a
nucleic acid sequence as set forth in Group A nucleic acid sequences and sequences
substantially identical thereto, or a polypeptide sequence as set forth in Group B amino acid
sequences and sequences substantially identical thereto, may be stored as text in a word
processing file, such as Microsoft WORD™ or WORDPERFECT™ or as an ASCII file in a
variety of database programs familiar to those of skill in the art, such as DB2™, SYBASE™, or
ORACLE™. In addition, many computer programs and databases may be used as sequence
comparison algorithms, identifiers, or sources of reference nucleotide sequences or polypeptide
sequences to be compared to a nucleic acid sequence as set forth in Group A nucleic acid
sequences and sequences substantially identical thereto, or a polypeptide sequence as set forth
in Group B amino acid sequences and sequences substantially identical thereto. The following
list is intended not to limit the invention but to provide guidance to programs and databases
which are useful with the nucleic acid sequences as set forth in Group A nucleic acid
sequences and sequences substantially identical thereto, or the polypeptide sequences as set
forth in Group B amino acid sequences and sequences substantially identical thereto.

The programs and databases which may be used include, but are not limited to:
MacPattern (EMBL), DiscoveryBase (Molecular Applications Group), GeneMine (Molecular
Applications Group), Look (Molecular Applications Group), MacLook (Molecular Applications
Group), BLAST and BLAST2 (NCBI), BLASTN and BLASTX (Altschul et al., J. Mol. Biol.
215:403, 1990), FASTA (Pearson and Lipman, Proc. Natl. Acad. Sci. USA, 85:2444, 1988),
FASTDB (Brutlag et al., Comp. App. Biosci. 6:237-245, 1990), Catalyst (Molecular Simulations
Inc.), Catalyst/SHAPE (Molecular Simulations Inc.), Cerius2.DBAccess (Molecular Simulations
Inc.), HypoGen (Molecular Simulations Inc.), Insight II (Molecular Simulations Inc.), Discover
(Molecular Simulations Inc.), CHARMm (Molecular Simulations Inc.), Felix (Molecular
Simulations Inc.), DelPhi (Molecular Simulations Inc.), QuanteMM (Molecular Simulations
Inc.), Homology (Molecular Simulations Inc.), Modeler (Molecular Simulations Inc.), ISIS
(Molecular Simulations Inc.), Quanta/Protein Design (Molecular Simulations Inc.), WebLab
(Molecular Simulations Inc.), WebLab Diversity Explorer (Molecular Simulations Inc.), Gene
Explorer (Molecular Simulations Inc.), SeqFold (Molecular Simulations Inc.), the MDL
Available Chemicals Directory database, the MDL Drug Data Report database, the
Comprehensive Medicinal Chemistry database, Derwent's World Drug Index database, the
BioByteMasterFile database, the Genbank database and the Genseqn database. Many other
programs and databases would be apparent to one of skill in the art given the present disclosure.

Motifs which may be detected using the above programs include sequences
encoding leucine zippers, helix-turn-helix motifs, glycosylation sites, ubiquitination sites,
alpha helices and beta sheets, signal sequences encoding signal peptides which direct the
secretion of the encoded proteins, sequences implicated in transcription regulation such as
homeoboxes, acidic stretches, enzymatic active sites, substrate binding sites and enzymatic
cleavage sites.

Hybridization of nucleic acids 

The invention provides isolated or recombinant nucleic acids that hybridize
under stringent conditions to an exemplary sequence of the invention (e.g.,
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35,
SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71,
SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83,
SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95,
SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107,
SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119,
SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID NO:129, SEQ ID NO:131,
SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143,
SEQ ID NO:145, SEQ ID NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155,
SEQ ID NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID NO:167,
SEQ ID NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID NO:179,
SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID NO:189, SEQ ID NO:191,
SEQ ID NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID NO:203,
SEQ ID NO:205, SEQ ID NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215,
SEQ ID NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID NO:227,
SEQ ID NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID NO:239,
SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID NO:249, SEQ ID NO:251,
SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID NO:263,
SEQ ID NO:265, SEQ ID NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275,
SEQ ID NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID NO:287,
SEQ ID NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID NO:299,
SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID NO:309, SEQ ID NO:311,
SEQ ID NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID NO:323,
SEQ ID NO:325, SEQ ID NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335,
SEQ ID NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID NO:347,
SEQ ID NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID NO:359,
SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID NO:369, SEQ ID NO:371,
SEQ ID NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID NO:379). The
stringent conditions can be highly stringent conditions, medium stringent conditions and/or
low stringent conditions, including the high and reduced stringency conditions described
herein. In one aspect, it is the stringency of the wash conditions that set forth the conditions
which determine whether a nucleic acid is within the scope of the invention, as discussed
below.

In alternative aspects, nucleic acids of the invention as defined by their ability
to hybridize under stringent conditions can be between about five residues and the full length
of nucleic acid of the invention; e.g., they can be at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 55,
60, 65, 70, 75, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750,
800, 850, 900, 950, 1000, or more, residues in length. Nucleic acids shorter than full length
are also included. These nucleic acids can be useful as, e.g., hybridization probes, labeling
probes, PCR oligonucleotide probes, iRNA (single or double stranded), antisense or
sequences encoding antibody binding peptides (epitopes), motifs, active sites and the like.

In one aspect, nucleic acids of the invention are defined by their ability to
hybridize under high stringency conditions comprising about 50% formamide at about 37°C
to 42°C. In one aspect, nucleic acids of the invention are defined by their ability to hybridize
under reduced stringency conditions comprising about 35% to 25% formamide at about
30°C to 35°C.

Alternatively, nucleic acids of the invention are defined by their ability to
hybridize under high stringency conditions comprising 42°C in 50% formamide, 5X SSPE,
0.3% SDS, and a repetitive sequence blocking nucleic acid, such as cot-1 or salmon sperm
DNA (e.g., 200 n/ml sheared and denatured salmon sperm DNA). In one aspect, nucleic
acids of the invention are defined by their ability to hybridize under reduced stringency
conditions comprising 35% formamide at a reduced temperature of 35°C.

In nucleic acid hybridization reactions, the conditions used to achieve a
particular level of stringency will vary, depending on the nature of the nucleic acids being
hybridized. For example, the length, degree of complementarity, nucleotide sequence
composition (e.g., GC v. AT content) and nucleic acid type (e.g., RNA v. DNA) of the
hybridizing regions of the nucleic acids can be considered in selecting hybridization
conditions. An additional consideration is whether one of the nucleic acids is immobilized,
for example, on a filter.

Hybridization may be carried out under conditions of low stringency, moderate
stringency or high stringency. As an example of nucleic acid hybridization, a polymer
membrane containing immobilized denatured nucleic acids is first prehybridized for 30
minutes at 45°C in a solution consisting of 0.9 M NaCl, 50 mM NaH2PO4, pH 7.0, 5.0 mM
Na2EDTA, 0.5% SDS, 10X Denhardt's and 0.5 mg/ml polyriboadenylic acid. Approximately
2 X 10^7 cpm (specific activity 4-9 X 10^8 cpm/µg) of 32P end-labeled oligonucleotide probe
are then added to the solution. After 12-16 hours of incubation, the membrane is washed for
30 minutes at room temperature in 1X SET (150 mM NaCl, 20 mM Tris hydrochloride, pH
7.8, 1 mM Na2EDTA) containing 0.5% SDS, followed by a 30 minute wash in fresh 1X SET
at Tm-10°C for the oligonucleotide probe. The membrane is then exposed to autoradiographic
film for detection of hybridization signals.

All of the foregoing hybridizations would be considered to be under conditions
of high stringency.

Following hybridization, a filter can be washed to remove any non-specifically
bound detectable probe. The stringency used to wash the filters can also be varied depending
on the nature of the nucleic acids being hybridized, the length of the nucleic acids being
hybridized, the degree of complementarity, the nucleotide sequence composition (e.g., GC v.
AT content) and the nucleic acid type (e.g., RNA v. DNA). Examples of progressively
higher stringency condition washes are as follows: 2X SSC, 0.1% SDS at room temperature
for 15 minutes (low stringency); 0.1X SSC, 0.5% SDS at room temperature for 30 minutes to 1
hour (moderate stringency); 0.1X SSC, 0.5% SDS for 15 to 30 minutes at between the
hybridization temperature and 68°C (high stringency); and 0.15M NaCl for 15 minutes at 72°C
(very high stringency). A final low stringency wash can be conducted in 0.1X SSC at room
temperature. The examples above are merely illustrative of one set of conditions that can be
used to wash filters. One of skill in the art would know that there are numerous recipes for
different stringency washes. Some other examples are given below.

Nucleic acids which have hybridized to the probe are identified by
autoradiography or other conventional techniques.

The above procedure may be modified to identify nucleic acids having
decreasing levels of homology to the probe sequence. For example, to obtain nucleic acids of
decreasing homology to the detectable probe, less stringent conditions may be used. For
example, the hybridization temperature may be decreased in increments of 5°C from 68°C to
42°C in a hybridization buffer having a Na+ concentration of approximately 1M. Following
hybridization, the filter may be washed with 2X SSC, 0.5% SDS at the temperature of
hybridization. These conditions are considered to be "moderate" conditions above 50°C and
"low" conditions below 50°C. A specific example of "moderate" hybridization conditions is
when the above hybridization is conducted at 55°C. A specific example of "low stringency"
hybridization conditions is when the above hybridization is conducted at 45°C.

Alternatively, the hybridization may be carried out in buffers, such as 6X SSC,
containing formamide at a temperature of 42°C. In this case, the concentration of formamide in
the hybridization buffer may be reduced in 5% increments from 50% to 0% to identify clones
having decreasing levels of homology to the probe. Following hybridization, the filter may be
washed with 6X SSC, 0.5% SDS at 50°C. These conditions are considered to be "moderate"
conditions above 25% formamide and "low" conditions below 25% formamide. A specific
example of "moderate" hybridization conditions is when the above hybridization is conducted at
30% formamide. A specific example of "low stringency" hybridization conditions is when the
above hybridization is conducted at 10% formamide.
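
The two stepping schemes and their "moderate"/"low" labels can be summarized in a short sketch. The function names are hypothetical, and returning `None` at exactly 50°C or exactly 25% formamide is an assumption made here, since the text classifies only values above and below those points.

```python
from typing import List, Optional


def temperature_series(start_c: int = 68, stop_c: int = 42, step_c: int = 5) -> List[int]:
    """Hybridization temperatures stepping down from start_c, never below stop_c."""
    return list(range(start_c, stop_c - 1, -step_c))


def formamide_series(start_pct: int = 50, stop_pct: int = 0, step_pct: int = 5) -> List[int]:
    """Formamide concentrations (percent) stepping down from start_pct to stop_pct."""
    return list(range(start_pct, stop_pct - 1, -step_pct))


def classify_by_temperature(temp_c: float) -> Optional[str]:
    """Buffer at about 1 M Na+: 'moderate' above 50 C, 'low' below 50 C."""
    if temp_c > 50:
        return "moderate"
    if temp_c < 50:
        return "low"
    return None  # assumption: the boundary value itself is left unclassified


def classify_by_formamide(percent: float) -> Optional[str]:
    """Formamide buffer at 42 C: 'moderate' above 25%, 'low' below 25%."""
    if percent > 25:
        return "moderate"
    if percent < 25:
        return "low"
    return None  # assumption: the boundary value itself is left unclassified


if __name__ == "__main__":
    for t in temperature_series():
        print(t, classify_by_temperature(t))
```

Note that stepping down from 68°C in 5°C increments ends at 43°C, the last step that does not go below the 42°C lower limit.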

However, the selection of a hybridization format is not critical - it is the
stringency of the wash conditions that set forth the conditions which determine whether a
nucleic acid is within the scope of the invention. Wash conditions used to identify nucleic
acids within the scope of the invention include, e.g.: a salt concentration of about 0.02 molar
at pH 7 and a temperature of at least about 50°C or about 55°C to about 60°C; or, a salt
concentration of about 0.15 M NaCl at 72°C for about 15 minutes; or, a salt concentration of
about 0.2X SSC at a temperature of at least about 50°C or about 55°C to about 60°C for
about 15 to about 20 minutes; or, the hybridization complex is washed twice with a solution
with a salt concentration of about 2X SSC containing 0.1% SDS at room temperature for 15
minutes and then washed twice by 0.1X SSC containing 0.1% SDS at 68°C for 15 minutes;
or, equivalent conditions. See Sambrook, Tijssen and Ausubel for a description of SSC
buffer and equivalent conditions.

These methods may be used to isolate nucleic acids of the invention. For
example, the preceding methods may be used to isolate nucleic acids having a sequence with
at least about 97%, at least 95%, at least 90%, at least 85%, at least 80%, at least 75%, at least
70%, at least 65%, at least 60%, at least 55%, or at least 50% homology to a nucleic acid
sequence selected from the group consisting of one of the sequences of Group A nucleic acid
sequences and sequences substantially identical thereto, or fragments comprising at least
about 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, or 500 consecutive bases
thereof and the sequences complementary thereto. Homology may be measured using the
alignment algorithm. For example, the homologous polynucleotides may have a coding
sequence which is a naturally occurring allelic variant of one of the coding sequences
described herein. Such allelic variants may have a substitution, deletion or addition of one or
more nucleotides when compared to the nucleic acids of Group A nucleic acid sequences or
the sequences complementary thereto.

Additionally, the above procedures may be used to isolate nucleic acids which
encode polypeptides having at least about 99%, 95%, at least 90%, at least 85%, at least 80%,
at least 75%, at least 70%, at least 65%, at least 60%, at least 55%, or at least 50% homology
to a polypeptide having the sequence of one of Group B amino acid sequences and sequences
substantially identical thereto, or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40,
50, 75, 100, or 150 consecutive amino acids thereof as determined using a sequence alignment
algorithm (e.g., such as the FASTA version 3.0t78 algorithm with the default parameters).
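
Once two sequences have been aligned, a percentage can be compared against the thresholds listed above. The sketch below is not FASTA and does not perform the alignment itself; it assumes the alignment is already available as two equal-length strings with "-" marking gaps, uses percent identity as a stand-in for "homology", and counts all alignment columns in the denominator. All of these are assumptions made for illustration.

```python
from typing import Optional

# Polypeptide thresholds recited in the text, highest first.
THRESHOLDS = (99, 95, 90, 85, 80, 75, 70, 65, 60, 55, 50)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")          # ignore columns that are gaps in both
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float) -> Optional[int]:
    """Return the largest recited threshold that `identity` reaches, or None."""
    for threshold in THRESHOLDS:
        if identity >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    pid = percent_identity("MKV-LAT", "MKVALAT")
    print(round(pid, 1), highest_threshold_met(pid))
```

Other denominators (for example, the length of the shorter sequence) are also in common use and would give different percentages for the same alignment.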

Oligonucleotide probes and methods for using them

The invention also provides nucleic acid probes that can be used, e.g., for
identifying nucleic acids encoding a polypeptide with a xylanase activity or fragments thereof
or for identifying xylanase genes. In one aspect, the probe comprises at least 10 consecutive
bases of a nucleic acid of the invention. Alternatively, a probe of the invention can be at least
about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45,
50, 60, 70, 80, 90, 100, 110, 120, 130, 150 or about 10 to 50, about 20 to 60, or about 30 to 70,
consecutive bases of a sequence as set forth in a nucleic acid of the invention. The probes
identify a nucleic acid by binding and/or hybridization. The probes can be used in arrays of
the invention, see discussion below, including, e.g., capillary arrays. The probes of the
invention can also be used to isolate other nucleic acids or polypeptides.
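
A probe "comprising at least 10 consecutive bases" of a sequence can be enumerated with a sliding window. The sketch below is illustrative only; the function name `candidate_probes` and the example sequence are assumptions and do not correspond to any sequence of the invention.

```python
from typing import List


def candidate_probes(sequence: str, length: int = 10) -> List[str]:
    """Return every run of `length` consecutive bases found in `sequence`."""
    if length < 1:
        raise ValueError("length must be positive")
    seq = sequence.upper()
    # A sequence shorter than `length` yields an empty list.
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


if __name__ == "__main__":
    for probe in candidate_probes("ACGTACGTACGT", length=10):
        print(probe)
```

In practice the candidates would be filtered further, for example by melting temperature or by uniqueness within the sample, which this sketch does not attempt.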

The isolated nucleic acids of Group A nucleic acid sequences and sequences
substantially identical thereto, the sequences complementary thereto, or a fragment
comprising at least 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, or 500
consecutive bases of one of the sequences of Group A nucleic acid sequences and sequences
substantially identical thereto, or the sequences complementary thereto may also be used as
probes to determine whether a biological sample, such as a soil sample, contains an organism
having a nucleic acid sequence of the invention or an organism from which the nucleic acid
was obtained. In such procedures, a biological sample potentially harboring the organism
from which the nucleic acid was isolated is obtained and nucleic acids are obtained from the
sample. The nucleic acids are contacted with the probe under conditions which permit the
probe to specifically hybridize to any complementary sequences which are present
therein.

Where necessary, conditions which permit the probe to specifically hybridize
to complementary sequences may be determined by placing the probe in contact with
complementary sequences from samples known to contain the complementary sequence as
well as control sequences which do not contain the complementary sequence. Hybridization
conditions, such as the salt concentration of the hybridization buffer, the formamide
concentration of the hybridization buffer, or the hybridization temperature, may be varied to
identify conditions which allow the probe to hybridize specifically to complementary nucleic
acids.

If the sample contains the organism from which the nucleic acid was isolated,
specific hybridization of the probe is then detected. Hybridization may be detected by
labeling the probe with a detectable agent such as a radioactive isotope, a fluorescent dye or
an enzyme capable of catalyzing the formation of a detectable product.

Many methods for using the labeled probes to detect the presence of
complementary nucleic acids in a sample are familiar to those skilled in the art. These
include Southern Blots, Northern Blots, colony hybridization procedures and dot blots.
Protocols for each of these procedures are provided in Ausubel et al., Current Protocols in
Molecular Biology, John Wiley & Sons, Inc. (1997) and Sambrook et al., Molecular Cloning:
A Laboratory Manual 2nd Ed., Cold Spring Harbor Laboratory Press (1989).

Alternatively, more than one probe (at least one of which is capable of
specifically hybridizing to any complementary sequences which are present in the nucleic acid
sample), may be used in an amplification reaction to determine whether the sample contains
an organism containing a nucleic acid sequence of the invention (e.g., an organism from
which the nucleic acid was isolated). Typically, the probes comprise oligonucleotides. In
one aspect, the amplification reaction may comprise a PCR reaction. PCR protocols are
described in Ausubel and Sambrook, supra. Alternatively, the amplification may comprise a
ligase chain reaction, 3SR, or strand displacement reaction. (See Barany, F., "The Ligase Chain
Reaction in a PCR World", PCR Methods and Applications 1:5-16, 1991; E. Fahy et al.,
"Self-sustained Sequence Replication (3SR): An Isothermal Transcription-based Amplification
System Alternative to PCR", PCR Methods and Applications 1:25-33, 1991; and Walker G.T. et
al., "Strand Displacement Amplification-an Isothermal in vitro DNA Amplification Technique",
Nucleic Acids Research 20:1691-1696, 1992). In such procedures, the nucleic acids in the
sample are contacted with the probes, the amplification reaction is performed and any resulting
amplification product is detected. The amplification product may be detected by performing gel
electrophoresis on the reaction products and staining the gel with an intercalator such as
ethidium bromide. Alternatively, one or more of the probes may be labeled with a radioactive
isotope and the presence of a radioactive amplification product may be detected by
autoradiography after gel electrophoresis.

Probes derived from sequences near the ends of the sequences of Group A
nucleic acid sequences and sequences substantially identical thereto, may also be used in
chromosome walking procedures to identify clones containing genomic sequences located
adjacent to the sequences of Group A nucleic acid sequences and sequences substantially
identical thereto. Such methods allow the isolation of genes which encode additional proteins
from the host organism.

The isolated nucleic acids of Group A nucleic acid sequences and sequences
substantially identical thereto, the sequences complementary thereto, or a fragment
comprising at least 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, or 500
consecutive bases of one of the sequences of Group A nucleic acid sequences and sequences
substantially identical thereto, or the sequences complementary thereto may be used as
probes to identify and isolate related nucleic acids. In some aspects, the related nucleic acids
may be cDNAs or genomic DNAs from organisms other than the one from which the nucleic
acid was isolated. For example, the other organisms may be related organisms. In such
procedures, a nucleic acid sample is contacted with the probe under conditions which permit
the probe to specifically hybridize to related sequences. Hybridization of the probe to nucleic
acids from the related organism is then detected using any of the methods described above.

By varying the stringency of the hybridization conditions used to identify
nucleic acids, such as cDNAs or genomic DNAs, which hybridize to the detectable probe,
nucleic acids having different levels of homology to the probe can be identified and isolated.
Stringency may be varied by conducting the hybridization at varying temperatures below the
melting temperatures of the probes. The melting temperature, Tm, is the temperature (under
defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly
complementary probe. Very stringent conditions are selected to be equal to or about 5°C lower
than the Tm for a particular probe. The melting temperature of the probe may be calculated
using the following formulas:

For probes between 14 and 70 nucleotides in length the melting temperature
(Tm) is calculated using the formula: Tm = 81.5 + 16.6(log [Na+]) + 0.41(fraction G+C) - (600/N)
where N is the length of the probe.

If the hybridization is carried out in a solution containing formamide, the melting
temperature may be calculated using the equation: Tm = 81.5 + 16.6(log [Na+]) + 0.41(fraction
G+C) - (0.63% formamide) - (600/N) where N is the length of the probe.
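
The two melting temperature formulas can be evaluated with a short function. This is a sketch under stated assumptions: the logarithm is taken as base 10, [Na+] is in mol/l, and the G+C term is entered as a percentage (0 to 100), which is how this formula is commonly applied even though the text writes "fraction G+C". The function and parameter names are hypothetical.

```python
import math


def gc_percent(probe: str) -> float:
    """G+C content of the probe as a percentage (0-100)."""
    seq = probe.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def melting_temperature(probe: str, na_molar: float, formamide_percent: float = 0.0) -> float:
    """Tm in degrees C for probes of 14 to 70 nucleotides.

    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%G+C) - 0.63*(% formamide) - 600/N
    With formamide_percent = 0 this reduces to the formamide-free formula.
    """
    n = len(probe)
    if not 14 <= n <= 70:
        raise ValueError("formula is stated for probes of 14 to 70 nucleotides")
    return (
        81.5
        + 16.6 * math.log10(na_molar)      # assumption: base-10 logarithm
        + 0.41 * gc_percent(probe)         # assumption: G+C entered as a percentage
        - 0.63 * formamide_percent
        - 600.0 / n
    )


if __name__ == "__main__":
    probe = "ATGCATGCATGCATGCATGC"  # illustrative 20-mer, 50% G+C
    print(melting_temperature(probe, na_molar=1.0))
    print(melting_temperature(probe, na_molar=1.0, formamide_percent=50.0))
```

For the illustrative 20-mer at 1 M Na+, the terms are 81.5 + 0 + 20.5 - 30, and 50% formamide lowers the result by a further 31.5°C.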

Prehybridization may be carried out in 6X SSC, 5X Denhardt's reagent, 0.5%
SDS, 100 µg denatured fragmented salmon sperm DNA or 6X SSC, 5X Denhardt's reagent,
0.5% SDS, 100 µg denatured fragmented salmon sperm DNA, 50% formamide. The formulas
for SSC and Denhardt's solutions are listed in Sambrook et al., supra.

Hybridization is conducted by adding the detectable probe to the
prehybridization solutions listed above. Where the probe comprises double stranded DNA, it is
denatured before addition to the hybridization solution. The filter is contacted with the
hybridization solution for a sufficient period of time to allow the probe to hybridize to cDNAs or
genomic DNAs containing sequences complementary thereto or homologous thereto. For
probes over 200 nucleotides in length, the hybridization may be carried out at 15-25°C below
the Tm. For shorter probes, such as oligonucleotide probes, the hybridization may be conducted
at 5-10°C below the Tm. Typically, for hybridizations in 6X SSC, the hybridization is conducted
at approximately 68°C. Usually, for hybridizations in 50% formamide containing solutions, the
hybridization is conducted at approximately 42°C.
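
The temperature offsets relative to the Tm can be expressed as a small helper. This is a sketch: the function name is hypothetical, and treating every probe of 200 nucleotides or fewer as a "shorter probe" is an assumption, since the text gives oligonucleotide probes only as an example of that class.

```python
from typing import Tuple


def hybridization_window(tm_c: float, probe_length: int) -> Tuple[float, float]:
    """Return the (lowest, highest) suggested hybridization temperature in degrees C."""
    if probe_length > 200:
        return (tm_c - 25.0, tm_c - 15.0)   # 15-25 C below the Tm
    # assumption: all probes of 200 nt or fewer use the "shorter probe" rule
    return (tm_c - 10.0, tm_c - 5.0)        # 5-10 C below the Tm


if __name__ == "__main__":
    print(hybridization_window(72.0, 20))
    print(hybridization_window(85.0, 500))
```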

Inhibiting Expression of Xylanases

The invention provides nucleic acids complementary to (e.g., antisense
sequences to) the nucleic acids of the invention, e.g., xylanase-encoding nucleic acids.
Antisense sequences are capable of inhibiting the transport, splicing or transcription of
xylanase-encoding genes. The inhibition can be effected through the targeting of genomic
DNA or messenger RNA. The transcription or function of targeted nucleic acid can be
inhibited, for example, by hybridization and/or cleavage. One particularly useful set of
inhibitors provided by the present invention includes oligonucleotides which are able to either
bind xylanase gene or message, in either case preventing or inhibiting the production or
function of xylanase. The association can be through sequence specific hybridization.
Another useful class of inhibitors includes oligonucleotides which cause inactivation or
cleavage of xylanase message. The oligonucleotide can have enzyme activity which causes
such cleavage, such as ribozymes. The oligonucleotide can be chemically modified or
conjugated to an enzyme or composition capable of cleaving the complementary nucleic acid.
A pool of many different such oligonucleotides can be screened for those with the desired
activity. Thus, the invention provides various compositions for the inhibition of xylanase
expression on a nucleic acid and/or protein level, e.g., antisense, iRNA and ribozymes
comprising xylanase sequences of the invention and the anti-xylanase antibodies of the
invention.

Inhibition of xylanase expression can have a variety of industrial applications.
For example, inhibition of xylanase expression can slow or prevent spoilage. Spoilage can
occur when polysaccharides, e.g., structural polysaccharides, are enzymatically degraded.
This can lead to the deterioration, or rot, of fruits and vegetables. In one aspect,
compositions of the invention that inhibit the expression and/or activity of xylanases, e.g.,
antibodies, antisense oligonucleotides, ribozymes and RNAi, are used to slow or prevent
spoilage. Thus, in one aspect, the invention provides methods and compositions comprising
application onto a plant or plant product (e.g., a cereal, a grain, a fruit, seed, root, leaf, etc.)
of antibodies, antisense oligonucleotides, ribozymes and RNAi of the invention to slow or
prevent spoilage. These compositions also can be expressed by the plant (e.g., a transgenic
plant) or another organism (e.g., a bacterium or other microorganism transformed with a
xylanase gene of the invention).

The compositions of the invention for the inhibition of xylanase expression 
(e.g., antisense, iRNA, ribozymes, antibodies) can be used as pharmaceutical compositions, 
e.g., as anti-pathogen agents or in other therapies, e.g., as anti-microbials for, e.g., 
Salmonella. 

Antisense Oligonucleotides

The invention provides antisense oligonucleotides capable of binding xylanase
message which can inhibit xylan hydrolase activity (e.g., catalyzing hydrolysis of internal
β-1,4-xylosidic linkages) by targeting mRNA. Strategies for designing antisense
oligonucleotides are well described in the scientific and patent literature, and the skilled
artisan can design such xylanase oligonucleotides using the novel reagents of the invention.
For example, gene walking/RNA mapping protocols to screen for effective antisense
oligonucleotides are well known in the art, see, e.g., Ho (2000) Methods Enzymol. 314:168-
183, describing an RNA mapping assay, which is based on standard molecular techniques to
provide an easy and reliable method for potent antisense sequence selection. See also Smith
(2000) Eur. J. Pharm. Sci. 11:191-198.
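
As a simple illustration of antisense design, candidate oligonucleotides can be generated as the reverse complement of successive windows along a target message, for later screening. This sketch is not the RNA mapping assay cited above; the function names, the default window length of 18, the choice to write the antisense strand as DNA, and the example message are assumptions made here.

```python
from typing import List, Tuple

# Complement table for DNA bases; the message is converted from RNA to DNA letters first.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_for(target: str) -> str:
    """Return the antisense DNA oligonucleotide (5'->3') for an mRNA segment (5'->3')."""
    dna = target.upper().replace("U", "T")
    return dna.translate(_COMPLEMENT)[::-1]   # complement, then reverse


def antisense_candidates(mrna: str, length: int = 18, step: int = 1) -> List[Tuple[int, str]]:
    """Return (offset, antisense oligo) pairs for windows tiled along the message."""
    return [
        (i, antisense_for(mrna[i:i + length]))
        for i in range(0, len(mrna) - length + 1, step)
    ]


if __name__ == "__main__":
    # Illustrative message fragment only; not a sequence of the invention.
    for offset, oligo in antisense_candidates("AUGGCCAUU", length=6, step=3):
        print(offset, oligo)
```

Which candidates actually inhibit expression would still have to be determined empirically, for example by the screening approaches referenced in the text.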

Naturally occurring nucleic acids are used as antisense oligonucleotides. The
antisense oligonucleotides can be of any length; for example, in alternative aspects, the
antisense oligonucleotides are between about 5 to 100, about 10 to 80, about 15 to 60, about
18 to 40. The optimal length can be determined by routine screening. The antisense
oligonucleotides can be present at any concentration. The optimal concentration can be
determined by routine screening. A wide variety of synthetic, non-naturally occurring
nucleotide and nucleic acid analogues are known which can address this potential problem.
For example, peptide nucleic acids (PNAs) containing non-ionic backbones, such as N-(2-
aminoethyl) glycine units can be used. Antisense oligonucleotides having phosphorothioate
linkages can also be used, as described in WO 97/03211; WO 96/39154; Mata (1997) Toxicol
Appl Pharmacol 144:189-197; Antisense Therapeutics, ed. Agrawal (Humana Press, Totowa,
N.J., 1996). Antisense oligonucleotides having synthetic DNA backbone analogues provided
by the invention can also include phosphoro-dithioate, methylphosphonate, phosphoramidate,
alkyl phosphotriester, sulfamate, 3'-thioacetal, methylene(methylimino), 3'-N-carbamate, and
morpholino carbamate nucleic acids, as described above.

Combinatorial chemistry methodology can be used to create vast numbers of
oligonucleotides that can be rapidly screened for specific oligonucleotides that have
appropriate binding affinities and specificities toward any target, such as the sense and
antisense xylanase sequences of the invention (see, e.g., Gold (1995) J. of Biol. Chem.
270:13581-13584).

Inhibitory Ribozymes 

The invention provides ribozymes capable of binding xylanase message. 

20 These ribozymes can inhibit xylanase activity by, e.g., targeting iriRNA. Strategies for 

designing ribozymes and selecting the xylanase-specific antisense sequence for targeting are 
well described in the scientific and patent literature, and the skilled artisan can design such 
ribozymes using the novel reagents of the invention. Ribozymes act by binding to a target 
RNA through the target RNA binding portion of a ribozyme which is held in close proximity
to an enzymatic portion of the RNA that cleaves the target RNA. Thus, the ribozyme
recognizes and binds a target RNA through complementary base-pairing, and once bound to
the correct site, acts enzymatically to cleave and inactivate the target RNA. Cleavage of a 
target RNA in such a manner will destroy its ability to direct synthesis of an encoded protein 
if the cleavage occurs in the coding sequence. After a ribozyme has bound and cleaved its
RNA target, it can be released from that RNA to bind and cleave new targets repeatedly.

In some circumstances, the enzymatic nature of a ribozyme can be 
advantageous over other technologies, such as antisense technology (where a nucleic acid 
molecule simply binds to a nucleic acid target to block its transcription, translation or
association with another molecule) as the effective concentration of ribozyme necessary to 
effect a therapeutic treatment can be lower than that of an antisense oligonucleotide. This 
potential advantage reflects the ability of the ribozyme to act enzymatically. Thus, a single 
ribozyme molecule is able to cleave many molecules of target RNA. In addition, a ribozyme 
is typically a highly specific inhibitor, with the specificity of inhibition depending not only on
the base pairing mechanism of binding, but also on the mechanism by which the molecule
inhibits the expression of the RNA to which it binds. That is, the inhibition is caused by
cleavage of the RNA target and so specificity is defined as the ratio of the rate of cleavage of
the targeted RNA over the rate of cleavage of non-targeted RNA. This cleavage mechanism
is dependent upon factors additional to those involved in base pairing. Thus, the specificity
of action of a ribozyme can be greater than that of an antisense oligonucleotide binding the same
RNA site. 

The ribozyme of the invention, e.g., an enzymatic ribozyme RNA molecule, 
can be formed in a hammerhead motif, a hairpin motif, a hepatitis delta virus motif, a
group I intron motif and/or an RNaseP-like RNA in association with an RNA guide sequence.
Examples of hammerhead motifs are described by, e.g., Rossi (1992) Aids Research and 
Human Retroviruses 8:183; hairpin motifs by Hampel (1989) Biochemistry 28:4929, and 
Hampel (1990) Nuc. Acids Res. 18:299; the hepatitis delta virus motif by Perrotta (1992) 
Biochemistry 31:16; the RNaseP motif by Guerrier-Takada (1983) Cell 35:849; and the group
I intron by Cech U.S. Pat. No. 4,987,071. The recitation of these specific motifs is not
intended to be limiting. Those skilled in the art will recognize that a ribozyme of the 
invention, e.g., an enzymatic RNA molecule of this invention, can have a specific substrate 
binding site complementary to one or more of the target gene RNA regions. A ribozyme of 
the invention can have a nucleotide sequence within or surrounding that substrate binding site
which imparts an RNA cleaving activity to the molecule.

RNA interference (RNAi) 

In one aspect, the invention provides an RNA inhibitory molecule, a so-called 
"RNAi" molecule, comprising a xylanase sequence of the invention. The RNAi molecule 
comprises a double-stranded RNA (dsRNA) molecule. The RNAi can inhibit expression of 
a xylanase gene. In one aspect, the RNAi is about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or
more duplex nucleotides in length. While the invention is not limited by any particular 
mechanism of action, the RNAi can enter a cell and cause the degradation of a
single-stranded RNA (ssRNA) of similar or identical sequences, including endogenous mRNAs.
When a cell is exposed to double-stranded RNA (dsRNA), mRNA from the homologous gene
is selectively degraded by a process called RNA interference (RNAi). A possible basic 
mechanism behind RNAi is the breaking of a double-stranded RNA (dsRNA) matching a 
specific gene sequence into short pieces called short interfering RNA, which trigger the 
degradation of mRNA that matches its sequence. In one aspect, the RNAi's of the invention
are used in gene-silencing therapeutics, see, e.g., Shuey (2002) Drug Discov. Today 7:1040-
1046. In one aspect, the invention provides methods to selectively degrade RNA using the
RNAi's of the invention. The process may be practiced in vitro, ex vivo or in vivo. In one
aspect, the RNAi molecules of the invention can be used to generate a loss-of-function
mutation in a cell, an organ or an animal. Methods for making and using RNAi molecules to
selectively degrade RNA are well known in the art, see, e.g., U.S. Patent Nos. 6,506,559;
6,511,824; 6,515,109; 6,489,127.

Modification of Nucleic Acids 

The invention provides methods of generating variants of the nucleic acids of
the invention, e.g., those encoding a xylanase. These methods can be repeated or used in
various combinations to generate xylanases having an altered or different activity or an 
altered or different stability from that of a xylanase encoded by the template nucleic acid. 
These methods also can be repeated or used in various combinations, e.g., to generate 
variations in gene/message expression, message translation or message stability. In another
aspect, the genetic composition of a cell is altered by, e.g., modification of a homologous
gene ex vivo, followed by its reinsertion into the cell. 

A nucleic acid of the invention can be altered by any means, for example, by
random or stochastic methods, or by non-stochastic, or "directed evolution," methods, see, e.g.,
U.S. Patent No. 6,361,974. Methods for random mutation of genes are well known in the art,
see, e.g., U.S. Patent No. 5,830,696. For example, mutagens can be used to randomly mutate
a gene. Mutagens include, e.g., ultraviolet light or gamma irradiation, or a chemical 
mutagen, e.g., mitomycin, nitrous acid, photoactivated psoralens, alone or in combination, to 
induce DNA breaks amenable to repair by recombination. Other chemical mutagens include, 
for example, sodium bisulfite, nitrous acid, hydroxylamine, hydrazine or formic acid. Other
mutagens are analogues of nucleotide precursors, e.g., nitrosoguanidine, 5-bromouracil,
2-aminopurine, or acridine. These agents can be added to a PCR reaction in place of the
nucleotide precursor thereby mutating the sequence. Intercalating agents such as proflavine, 
acriflavine, quinacrine and the like can also be used.

Any technique in molecular biology can be used, e.g., random PCR
mutagenesis, see, e.g., Rice (1992) Proc. Natl. Acad. Sci. USA 89:5467-5471; or, 
combinatorial multiple cassette mutagenesis, see, e.g., Crameri (1995) Biotechniques 18:194- 
196. Alternatively, nucleic acids, e.g., genes, can be reassembled after random, or
"stochastic," fragmentation, see, e.g., U.S. Patent Nos. 6,291,242; 6,287,862; 6,287,861;
5,955,358; 5,830,721; 5,824,514; 5,811,238; 5,605,793. In alternative aspects, modifications,
additions or deletions are introduced by error-prone PCR, shuffling, oligonucleotide-directed 
mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo mutagenesis, cassette 
mutagenesis, recursive ensemble mutagenesis, exponential ensemble mutagenesis,
site-specific mutagenesis, gene reassembly (e.g., GeneReassembly™, see, e.g., U.S. Patent No.
6,537,776), gene site saturated mutagenesis (GSSM™), synthetic ligation reassembly (SLR), 
recombination, recursive sequence recombination, phosphothioate-modified DNA 
mutagenesis, uracil-containing template mutagenesis, gapped duplex mutagenesis, point 
mismatch repair mutagenesis, repair-deficient host strain mutagenesis, chemical mutagenesis,
radiogenic mutagenesis, deletion mutagenesis, restriction-selection mutagenesis,
restriction-purification mutagenesis, artificial gene synthesis, ensemble mutagenesis, chimeric nucleic
acid multimer creation, and/or a combination of these and other methods. 

The following publications describe a variety of recursive recombination 
procedures and/or methods which can be incorporated into the methods of the invention:
Stemmer (1999) "Molecular breeding of viruses for targeting and other clinical properties"
Tumor Targeting 4:1-4; Ness (1999) Nature Biotechnology 17:893-896; Chang (1999) 
"Evolution of a cytokine using DNA family shuffling" Nature Biotechnology 17:793-797; 
Minshull (1999) "Protein evolution by molecular breeding" Current Opinion in Chemical 
Biology 3:284-290; Christians (1999) "Directed evolution of thymidine kinase for AZT
phosphorylation using DNA family shuffling" Nature Biotechnology 17:259-264; Crameri
(1998) "DNA shuffling of a family of genes from diverse species accelerates directed 
evolution" Nature 391:288-291; Crameri (1997) "Molecular evolution of an arsenate 
detoxification pathway by DNA shuffling," Nature Biotechnology 15:436-438; Zhang (1997)
"Directed evolution of an effective fucosidase from a galactosidase by DNA shuffling and
screening" Proc. Natl. Acad. Sci. USA 94:4504-4509; Patten et al. (1997) "Applications of
DNA Shuffling to Pharmaceuticals and Vaccines" Current Opinion in Biotechnology 8:724- 
733; Crameri et al. (1996) "Construction and evolution of antibody-phage libraries by DNA 
shuffling" Nature Medicine 2:100-103; Gates et al. (1996) "Affinity selective isolation of 
ligands from peptide libraries through display on a lac repressor 'headpiece dimer'" Journal
of Molecular Biology 255:373-386; Stemmer (1996) "Sexual PCR and Assembly PCR" In:
The Encyclopedia of Molecular Biology. VCH Publishers, New York, pp.447-457; Crameri 
and Stemmer (1995) "Combinatorial multiple cassette mutagenesis creates all the 
permutations of mutant and wildtype cassettes" BioTechniques 18:194-195; Stemmer et al. 
(1995) "Single-step assembly of a gene and entire plasmid from large numbers of
oligodeoxyribonucleotides" Gene, 164:49-53; Stemmer (1995) "The Evolution of Molecular
Computation" Science 270:1510; Stemmer (1995) "Searching Sequence Space"
Bio/Technology 13:549-553; Stemmer (1994) "Rapid evolution of a protein in vitro by DNA 
shuffling" Nature 370:389-391; and Stemmer (1994) "DNA shuffling by random
fragmentation and reassembly: In vitro recombination for molecular evolution." Proc. Natl.
Acad. Sci. USA 91:10747-10751. 

Mutational methods of generating diversity include, for example, site-directed 
mutagenesis (Ling et al. (1997) "Approaches to DNA mutagenesis: an overview" Anal.
Biochem. 254(2):157-178; Dale et al. (1996) "Oligonucleotide-directed random mutagenesis
using the phosphorothioate method" Methods Mol. Biol. 57:369-374; Smith (1985) "In vitro
mutagenesis" Ann. Rev. Genet. 19:423-462; Botstein & Shortle (1985) "Strategies and
applications of in vitro mutagenesis" Science 229:1193-1201; Carter (1986) "Site-directed
mutagenesis" Biochem. J. 237:1-7; and Kunkel (1987) "The efficiency of oligonucleotide
directed mutagenesis" in Nucleic Acids & Molecular Biology (Eckstein, F. and Lilley, D. M.
J. eds., Springer Verlag, Berlin)); mutagenesis using uracil containing templates (Kunkel
(1985) "Rapid and efficient site-specific mutagenesis without phenotypic selection" Proc. 
Natl. Acad. Sci. USA 82:488-492; Kunkel et al. (1987) "Rapid and efficient site-specific 
mutagenesis without phenotypic selection" Methods in Enzymol. 154, 367-382; and Bass et 
al. (1988) "Mutant Trp repressors with new DNA-binding specificities" Science 242:240-
245); oligonucleotide-directed mutagenesis (Methods in Enzymol. 100: 468-500 (1983);
Methods in Enzymol. 154: 329-350 (1987); Zoller (1982) "Oligonucleotide-directed 
mutagenesis using M13-derived vectors: an efficient and general procedure for the 
production of point mutations in any DNA fragment" Nucleic Acids Res. 10:6487-6500; 
Zoller & Smith (1983) "Oligonucleotide-directed mutagenesis of DNA fragments cloned into
M13 vectors" Methods in Enzymol. 100:468-500; and Zoller (1987) "Oligonucleotide-directed
mutagenesis: a simple method using two oligonucleotide primers and a single-stranded DNA 
template" Methods in Enzymol. 154:329-350); phosphorothioate-modified DNA mutagenesis 
(Taylor (1985) "The use of phosphorothioate-modified DNA in restriction enzyme reactions 
to prepare nicked DNA" Nucl. Acids Res. 13: 8749-8764; Taylor (1985) "The rapid
generation of oligonucleotide-directed mutations at high frequency using
phosphorothioate-modified DNA" Nucl. Acids Res. 13: 8765-8787; Nakamaye (1986) "Inhibition of
restriction endonuclease Nci I cleavage by phosphorothioate groups and its application to
oligonucleotide-directed mutagenesis" Nucl. Acids Res. 14: 9679-9698; Sayers (1988) "5'-3'
Exonucleases in phosphorothioate-based oligonucleotide-directed mutagenesis" Nucl. Acids
Res. 16:791-802; and Sayers et al. (1988) "Strand specific cleavage of phosphorothioate- 
containing DNA by reaction with restriction endonucleases in the presence of ethidium 
bromide" Nucl. Acids Res. 16: 803-814); mutagenesis using gapped duplex DNA (Kramer et 
al. (1984) "The gapped duplex DNA approach to oligonucleotide-directed mutation
construction" Nucl. Acids Res. 12: 9441-9456; Kramer & Fritz (1987)
"Oligonucleotide-directed construction of mutations via gapped duplex DNA" Methods in Enzymol. 154:350-367;
Kramer (1988) "Improved enzymatic in vitro reactions in the gapped duplex DNA approach 
to oligonucleotide-directed construction of mutations" Nucl. Acids Res. 16: 7207; and Fritz 
(1988) "Oligonucleotide-directed construction of mutations: a gapped duplex DNA procedure
without enzymatic reactions in vitro" Nucl. Acids Res. 16: 6987-6999).

Additional protocols that can be used to practice the invention include point 
mismatch repair (Kramer (1984) "Point Mismatch Repair" Cell 38:879-887), mutagenesis 
using repair-deficient host strains (Carter et al. (1985) "Improved oligonucleotide
site-directed mutagenesis using M13 vectors" Nucl. Acids Res. 13: 4431-4443; and Carter (1987)
"Improved oligonucleotide-directed mutagenesis using M13 vectors" Methods in Enzymol.
154: 382-403), deletion mutagenesis (Eghtedarzadeh (1986) "Use of oligonucleotides to
generate large deletions" Nucl. Acids Res. 14: 5115), restriction-selection and
restriction-purification (Wells et al. (1986) "Importance of hydrogen-bond
formation in stabilizing the transition state of subtilisin" Phil. Trans. R. Soc. Lond. A 317:
415-423), mutagenesis by total gene synthesis (Nambiar et al. (1984) "Total synthesis and
cloning of a gene coding for the ribonuclease S protein" Science 223: 1299-1301; Sakmar
and Khorana (1988) "Total synthesis and expression of a gene for the α-subunit of bovine rod
outer segment guanine nucleotide-binding protein (transducin)" Nucl. Acids Res. 14: 6361-
6372; Wells et al. (1985) "Cassette mutagenesis: an efficient method for generation of
multiple mutations at defined sites" Gene 34:315-323; and Grundstrom et al. (1985)
"Oligonucleotide-directed mutagenesis by microscale 'shot-gun' gene synthesis" Nucl. Acids
Res. 13: 3305-3316), double-strand break repair (Mandecki (1986)
"Oligonucleotide-directed double-strand break repair in plasmids of Escherichia coli: a
method for site-specific mutagenesis" Proc. Natl. Acad. Sci. USA 83:7177-7181; Arnold (1993)
"Protein engineering for unusual environments" Current Opinion in Biotechnology 4:450-455). Additional
details on many of the above methods can be found in Methods in Enzymology Volume 154, 
which also describes useful controls for trouble-shooting problems with various mutagenesis 
methods. 

Protocols that can be used to practice the invention are described, e.g., in U.S.
Patent Nos. 5,605,793 to Stemmer (Feb. 25, 1997), "Methods for In Vitro Recombination;"
U.S. Pat. No. 5,811,238 to Stemmer et al. (Sep. 22, 1998) "Methods for Generating
Polynucleotides having Desired Characteristics by Iterative Selection and Recombination;" 
U.S. Pat. No. 5,830,721 to Stemmer et al. (Nov. 3, 1998), "DNA Mutagenesis by Random
Fragmentation and Reassembly;" U.S. Pat. No. 5,834,252 to Stemmer, et al. (Nov. 10, 1998)
"End-Complementary Polymerase Reaction;" U.S. Pat. No. 5,837,458 to Minshull, et al.
(Nov. 17, 1998), "Methods and Compositions for Cellular and Metabolic Engineering;" WO
95/22625, Stemmer and Crameri, "Mutagenesis by Random Fragmentation and Reassembly;" 
WO 96/33207 by Stemmer and Lipschutz "End Complementary Polymerase Chain
Reaction;" WO 97/20078 by Stemmer and Crameri "Methods for Generating Polynucleotides
having Desired Characteristics by Iterative Selection and Recombination;" WO 97/35966 by 
Minshull and Stemmer, "Methods and Compositions for Cellular and Metabolic 
Engineering;" WO 99/41402 by Punnonen et al. "Targeting of Genetic Vaccine Vectors;" 
WO 99/41383 by Punnonen et al. "Antigen Library Immunization;" WO 99/41369 by
Punnonen et al. "Genetic Vaccine Vector Engineering;" WO 99/41368 by Punnonen et al.
"Optimization of Immunomodulatory Properties of Genetic Vaccines;" EP 752008 by 
Stemmer and Crameri, "DNA Mutagenesis by Random Fragmentation and Reassembly;" EP 
0932670 by Stemmer "Evolving Cellular DNA Uptake by Recursive Sequence 
Recombination;" WO 99/23107 by Stemmer et al., "Modification of Virus Tropism and Host
Range by Viral Genome Shuffling;" WO 99/21979 by Apt et al., "Human Papillomavirus
Vectors;" WO 98/31837 by del Cardayre et al. "Evolution of Whole Cells and Organisms by
Recursive Sequence Recombination;" WO 98/27230 by Patten and Stemmer, "Methods and 
Compositions for Polypeptide Engineering;" WO 98/27230 by Stemmer et al., "Methods for 
Optimization of Gene Therapy by Recursive Sequence Shuffling and Selection," WO
00/00632, "Methods for Generating Highly Diverse Libraries," WO 00/09679, "Methods for
Obtaining in Vitro Recombined Polynucleotide Sequence Banks and Resulting Sequences," 
WO 98/42832 by Arnold et al., "Recombination of Polynucleotide Sequences Using Random 
or Defined Primers," WO 99/29902 by Arnold et al., "Method for Creating Polynucleotide 
and Polypeptide Sequences," WO 98/41653 by Vind, "An in Vitro Method for Construction
of a DNA Library," WO 98/41622 by Borchert et al., "Method for Constructing a Library
Using DNA Shuffling," and WO 98/42727 by Pati and Zarling, "Sequence Alterations using 
Homologous Recombination." 

Protocols that can be used to practice the invention (providing details
regarding various diversity generating methods) are described, e.g., in U.S. Patent application
serial no. (USSN) 09/407,800, "SHUFFLING OF CODON ALTERED GENES" by Patten et
al. filed Sep. 28, 1999; "EVOLUTION OF WHOLE CELLS AND ORGANISMS BY
RECURSIVE SEQUENCE RECOMBINATION" by del Cardayre et al., United States Patent
No. 6,379,964; "OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID
RECOMBINATION" by Crameri et al., United States Patent Nos. 6,319,714; 6,368,861;
6,376,246; 6,423,542; 6,426,224 and PCT/US00/01203; "USE OF CODON-VARIED
OLIGONUCLEOTIDE SYNTHESIS FOR SYNTHETIC SHUFFLING" by Welch et al., 
United States Patent No. 6,436,675; "METHODS FOR MAKING CHARACTER STRINGS, 
POLYNUCLEOTIDES & POLYPEPTIDES HAVING DESIRED CHARACTERISTICS"
by Selifonov et al., filed Jan. 18, 2000, (PCT/US00/01202) and, e.g. "METHODS FOR
MAKING CHARACTER STRINGS, POLYNUCLEOTIDES & POLYPEPTIDES HAVING
DESIRED CHARACTERISTICS" by Selifonov et al., filed Jul. 18, 2000 (U.S. Ser. No. 
09/618,579); "METHODS OF POPULATING DATA STRUCTURES FOR USE IN 
EVOLUTIONARY SIMULATIONS" by Selifonov and Stemmer, filed Jan. 18, 2000
(PCT/US00/01138); and "SINGLE-STRANDED NUCLEIC ACID
TEMPLATE-MEDIATED RECOMBINATION AND NUCLEIC ACID FRAGMENT ISOLATION" by
Affholter, filed Sep. 6, 2000 (U.S. Ser. No. 09/656,549); and United States Patent Nos. 
6,177,263; 6,153,410. 

Non-stochastic, or "directed evolution," methods, including, e.g., saturation
mutagenesis (GSSM™), synthetic ligation reassembly (SLR), or a combination thereof, are
used to modify the nucleic acids of the invention to generate xylanases with new or altered
properties (e.g., activity under highly acidic or alkaline conditions, high or low temperatures,
and the like). Polypeptides encoded by the modified nucleic acids can be screened for an
activity before testing for xylan hydrolysis or other activity. Any testing modality or protocol
can be used, e.g., using a capillary array platform. See, e.g., U.S. Patent Nos. 6,361,974;
6,280,926; 5,939,250.

Saturation mutagenesis, or, GSSM™

In one aspect, codon primers containing a degenerate N,N,G/T sequence are 
used to introduce point mutations into a polynucleotide, e.g., a xylanase or an antibody of the 
invention, so as to generate a set of progeny polypeptides in which a full range of single 
amino acid substitutions is represented at each amino acid position, e.g., an amino acid 
residue in an enzyme active site or ligand binding site targeted to be modified. These 
oligonucleotides can comprise a contiguous first homologous sequence, a degenerate 
N,N,G/T sequence, and, optionally, a second homologous sequence. The downstream 
progeny translational products from the use of such oligonucleotides include all possible 
amino acid changes at each amino acid site along the polypeptide, because the degeneracy of 
the N,N,G/T sequence includes codons for all 20 amino acids. In one aspect, one such 
degenerate oligonucleotide (comprised of, e.g., one degenerate N,N,G/T cassette) is used for
subjecting each original codon in a parental polynucleotide template to a full range of codon 
substitutions. In another aspect, at least two degenerate cassettes are used - either in the 
same oligonucleotide or not, for subjecting at least two original codons in a parental 
polynucleotide template to a full range of codon substitutions. For example, more than one 
N,N,G/T sequence can be contained in one oligonucleotide to introduce amino acid mutations 
at more than one site. This plurality of N,N,G/T sequences can be directly contiguous, or
separated by one or more additional nucleotide sequence(s). In another aspect, 
oligonucleotides serviceable for introducing additions and deletions can be used either alone 
or in combination with the codons containing an N,N,G/T sequence, to introduce any 
combination or permutation of amino acid additions, deletions, and/or substitutions. 
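
By way of non-limiting illustration only, the oligonucleotide layout described above (a first homologous sequence, one or more degenerate N,N,G/T cassettes and, optionally, a second homologous sequence) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the flanking sequences and the spacer are hypothetical, and "K" is used as the standard IUPAC symbol for G or T.

```python
from itertools import product

# IUPAC nucleotide codes used here: N = any base, K = G or T (the "G/T" position).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}


def saturation_oligo(first_homologous, second_homologous="", cassettes=1, spacer=""):
    """Join first homologous sequence, NNK cassette(s) and optional second homologous sequence.

    Cassettes are directly contiguous when spacer is empty, otherwise they are
    separated by the given additional nucleotide sequence.
    """
    degenerate = spacer.join(["NNK"] * cassettes)
    return first_homologous + degenerate + second_homologous


def expand(oligo):
    """Enumerate every concrete sequence represented by a degenerate oligo."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in oligo))]


if __name__ == "__main__":
    # Flanking sequences are arbitrary placeholders.
    single = saturation_oligo("GATTACAGG", "CCTGAATTC")
    double = saturation_oligo("GATTACAGG", "CCTGAATTC", cassettes=2)
    print(single, len(expand(single)))
    print(double, len(expand(double)))
```

Expanding a single cassette gives 4 x 4 x 2 sequences, and each additional cassette multiplies that count again, which is why multi-site libraries grow quickly.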

In one aspect, simultaneous mutagenesis of two or more contiguous amino 
acid positions is done using an oligonucleotide that contains contiguous N,N,G/T triplets, i.e. 
a degenerate (N,N,G/T)n sequence. In another aspect, degenerate cassettes having less 
degeneracy than the N,N,G/T sequence are used. For example, it may be desirable in some 
instances to use (e.g. in an oligonucleotide) a degenerate triplet sequence comprised of only 
one N, where said N can be in the first, second or third position of the triplet. Any other bases
including any combinations and permutations thereof can be used in the remaining two 
positions of the triplet. Alternatively, it may be desirable in some instances to use (e.g. in an 
oligo) a degenerate N,N,N triplet sequence. 
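
By way of non-limiting illustration only, the trade-off between the degenerate triplets mentioned above can be checked against the standard genetic code. The following is a minimal sketch: the function name is hypothetical, "K" is the IUPAC symbol for G or T, and "GNT" is an arbitrary example of a triplet comprised of only one N.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...; "*" marks a stop.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# N = any base; K = G or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}


def degeneracy(triplet):
    """Return (codon count, distinct amino acids, stop codons) for a degenerate triplet."""
    codons = ["".join(p) for p in product(*(IUPAC[b] for b in triplet))]
    products = [CODON_TABLE[c] for c in codons]
    amino_acids = set(products) - {"*"}
    return len(codons), len(amino_acids), products.count("*")


if __name__ == "__main__":
    for triplet in ("NNN", "NNK", "GNT"):
        print(triplet, degeneracy(triplet))
```

Under the standard code, N,N,G/T halves the number of codons relative to N,N,N while still covering all 20 amino acids and retaining only one stop codon; a one-N triplet such as the example covers only a handful of residues.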

In one aspect, use of degenerate triplets (e.g., N,N,G/T triplets) allows for 
systematic and easy generation of a full range of possible natural amino acids (for a total of 
20 amino acids) into each and every amino acid position in a polypeptide (in alternative
aspects, the methods also include generation of less than all possible substitutions per amino
acid residue, or codon, position). For example, for a 100 amino acid polypeptide, 2000 
distinct species (i.e. 20 possible amino acids per position X 100 amino acid positions) can be 
generated. Through the use of an oligonucleotide or set of oligonucleotides containing a 
degenerate N,N,G/T triplet, 32 individual sequences can code for all 20 possible natural
amino acids. Thus, in a reaction vessel in which a parental polynucleotide sequence is 
subjected to saturation mutagenesis using at least one such oligonucleotide, there are 
generated 32 distinct progeny polynucleotides encoding 20 distinct polypeptides. In contrast, 
the use of a non-degenerate oligonucleotide in site-directed mutagenesis leads to only one
progeny polypeptide product per reaction vessel. Nondegenerate oligonucleotides can
optionally be used in combination with degenerate primers disclosed; for example, 
nondegenerate oligonucleotides can be used to generate specific point mutations in a working 
polynucleotide. This provides one means to generate specific silent point mutations, point 
mutations leading to corresponding amino acid changes, and point mutations that cause the
generation of stop codons and the corresponding expression of polypeptide fragments.
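
By way of non-limiting illustration only, the three outcomes of a specific point mutation named above (a silent mutation, a corresponding amino acid change, or a stop codon) can be distinguished with the standard genetic code. The following is a minimal sketch; the function name and the labels it returns are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code; "*" marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_change(parent_codon, mutant_codon):
    """Classify a codon replacement as 'silent', 'missense' or 'nonsense'."""
    parent_aa = CODON_TABLE[parent_codon.upper()]
    mutant_aa = CODON_TABLE[mutant_codon.upper()]
    if mutant_aa == parent_aa:
        return "silent"      # same encoded residue, different codon
    if mutant_aa == "*":
        return "nonsense"    # new stop codon, truncated polypeptide fragment
    return "missense"        # corresponding amino acid change


if __name__ == "__main__":
    for parent, mutant in (("CTG", "CTC"), ("CTG", "ATG"), ("TGG", "TGA")):
        print(parent, mutant, classify_codon_change(parent, mutant))
```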

In one aspect, each saturation mutagenesis reaction vessel contains 
polynucleotides encoding at least 20 progeny polypeptide (e.g., xylanases) molecules such 
that all 20 natural amino acids are represented at the one specific amino acid position 
corresponding to the codon position mutagenized in the parental polynucleotide (other
aspects use less than all 20 natural combinations). The 32-fold degenerate progeny
polypeptides generated from each saturation mutagenesis reaction vessel can be subjected to
clonal amplification (e.g. cloned into a suitable host, e.g., E. coli host, using, e.g., an
expression vector) and subjected to expression screening. When an individual progeny 
polypeptide is identified by screening to display a favorable change in property (when
compared to the parental polypeptide, such as increased xylan hydrolysis activity under
alkaline or acidic conditions), it can be sequenced to identify the correspondingly favorable
amino acid substitution contained therein. 
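
By way of non-limiting illustration only, once a favorable progeny clone has been sequenced and translated, the substitution it carries can be read off by comparison with the parental polypeptide. The following is a minimal sketch that assumes substitutions only (equal-length sequences); the function name, the notation of the returned strings and the example sequences are hypothetical.

```python
def find_substitutions(parent_protein, progeny_protein):
    """Return substitutions such as 'T3S' (parent residue, 1-based position, new residue)."""
    if len(parent_protein) != len(progeny_protein):
        raise ValueError("sequences must have equal length (substitutions only)")
    return [
        f"{old}{position}{new}"
        for position, (old, new) in enumerate(
            zip(parent_protein, progeny_protein), start=1
        )
        if old != new
    ]


if __name__ == "__main__":
    # Made-up five-residue sequences, for illustration only.
    print(find_substitutions("MKTAY", "MKSAY"))
```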

In one aspect, upon mutagenizing each and every amino acid position in a 
parental polypeptide using saturation mutagenesis as disclosed herein, favorable amino acid
changes may be identified at more than one amino acid position. One or more new progeny
molecules can be generated that contain a combination of all or part of these favorable amino 
acid substitutions. For example, if 2 specific favorable amino acid changes are identified in 
each of 3 amino acid positions in a polypeptide, the permutations include 3 possibilities at 
each position (no change from the original amino acid, and each of two favorable changes)
and 3 positions. Thus, there are 3 x 3 x 3 or 27 total possibilities, including 7 that were
previously examined - 6 single point mutations (i.e. 2 at each of three positions) and no 
change at any position. 
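
By way of non-limiting illustration only, the count given above can be reproduced by enumerating the combinations directly. The following is a minimal sketch; the positions and residues in the example are hypothetical placeholders, as are the function names.

```python
from itertools import product


def combination_library(favorable):
    """Enumerate all combinations of original and favorable residues.

    favorable maps position -> (original residue, [favorable replacements]).
    Returns a list of (variant, number of changed positions).
    """
    positions = sorted(favorable)
    choices = [[favorable[pos][0]] + list(favorable[pos][1]) for pos in positions]
    library = []
    for combo in product(*choices):
        changes = sum(
            1 for pos, residue in zip(positions, combo) if residue != favorable[pos][0]
        )
        library.append((dict(zip(positions, combo)), changes))
    return library


def summarize(library):
    """Return (total variants, variants already examined, new variants).

    Variants with zero or one change were already examined during
    single-site saturation mutagenesis.
    """
    total = len(library)
    seen = sum(1 for _, changes in library if changes <= 1)
    return total, seen, total - seen


if __name__ == "__main__":
    # Hypothetical positions and residues: 2 favorable changes at each of 3 positions.
    example = {10: ("A", ["G", "S"]), 42: ("L", ["I", "V"]), 77: ("K", ["R", "Q"])}
    print(summarize(combination_library(example)))
```

For the example in the text this leaves 20 combinations that have not yet been examined and would be the candidates for a further round of screening.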

In yet another aspect, site-saturation mutagenesis can be used together with 
shuffling, chimerization, recombination and other mutagenizing processes, along with
screening. This invention provides for the use of any mutagenizing process(es), including
saturation mutagenesis, in an iterative manner. In one exemplification, the iterative use of 
any mutagenizing process(es) is used in combination with screening. 

The invention also provides for the use of proprietary codon primers
(containing a degenerate N,N,N sequence) to introduce point mutations into a polynucleotide,
so as to generate a set of progeny polypeptides in which a full range of single amino acid 
substitutions is represented at each amino acid position (gene site saturated mutagenesis 
(GSSM™)). The oligos used are comprised contiguously of a first homologous sequence, a 
degenerate N,N,N sequence and preferably but not necessarily a second homologous
sequence. The downstream progeny translational products from the use of such oligos
include all possible amino acid changes at each amino acid site along the polypeptide, 
because the degeneracy of the N,N,N sequence includes codons for all 20 amino acids. 

In one aspect, one such degenerate oligo (comprised of one degenerate N,N,N 
cassette) is used for subjecting each original codon in a parental polynucleotide template to a
full range of codon substitutions. In another aspect, at least two degenerate N,N,N cassettes
are used — either in the same oligo or not, for subjecting at least two original codons in a 
parental polynucleotide template to a full range of codon substitutions. Thus, more than one 
N,N,N sequence can be contained in one oligo to introduce amino acid mutations at more 
than one site. This plurality of N,N,N sequences can be directly contiguous, or separated by
one or more additional nucleotide sequence(s). In another aspect, oligos serviceable for
introducing additions and deletions can be used either alone or in combination with the 
codons containing an N,N,N sequence, to introduce any combination or permutation of amino 
acid additions, deletions and/or substitutions. 

In a particular exemplification, it is possible to simultaneously mutagenize two
or more contiguous amino acid positions using an oligo that contains contiguous N,N,N
triplets, i.e. a degenerate (N,N,N)n sequence. 

In another aspect, the present invention provides for the use of degenerate 
cassettes having less degeneracy than the N,N,N sequence. For example, it may be desirable 
in some instances to use (e.g. in an oligo) a degenerate triplet sequence comprised of only one
N, where the N can be in the first, second or third position of the triplet. Any other bases
including any combinations and permutations thereof can be used in the remaining two 
positions of the triplet. Alternatively, it may be desirable in some instances to use (e.g., in an 
oligo) a degenerate N,N,N triplet sequence, N,N,G/T, or an N,N,G/C triplet sequence.

It is appreciated, however, that the use of a degenerate triplet (such as
N,N,G/T or an N,N,G/C triplet sequence) as disclosed in the instant invention is
advantageous for several reasons. In one aspect, this invention provides a means to 
systematically and fairly easily generate the substitution of the full range of possible amino 
acids (for a total of 20 amino acids) into each and every amino acid position in a polypeptide.
Thus, for a 100 amino acid polypeptide, the invention provides a way to systematically and
fairly easily generate 2000 distinct species (i.e., 20 possible amino acids per position times
100 amino acid positions). It is appreciated that there is provided, through the use of an oligo
containing a degenerate N,N,G/T or an N,N,G/C triplet sequence, 32 individual sequences
that code for 20 possible amino acids. Thus, in a reaction vessel in which a parental
polynucleotide sequence is subjected to saturation mutagenesis using one such oligo, there
are generated 32 distinct progeny polynucleotides encoding 20 distinct polypeptides. In 
contrast, the use of a non-degenerate oligo in site-directed mutagenesis leads to only one 
progeny polypeptide product per reaction vessel. 

This invention also provides for the use of nondegenerate oligos, which can
optionally be used in combination with degenerate primers disclosed. It is appreciated that in
some situations, it is advantageous to use nondegenerate oligos to generate specific point 
mutations in a working polynucleotide. This provides a means to generate specific silent 
point mutations, point mutations leading to corresponding amino acid changes and point 
mutations that cause the generation of stop codons and the corresponding expression of
polypeptide fragments.

Thus, in one aspect of this invention, each saturation mutagenesis reaction 
vessel contains polynucleotides encoding at least 20 progeny polypeptide molecules such that 
all 20 amino acids are represented at the one specific amino acid position corresponding to 
the codon position mutagenized in the parental polynucleotide. The 32-fold degenerate
progeny polypeptides generated from each saturation mutagenesis reaction vessel can be
subjected to clonal amplification (e.g., cloned into a suitable E. coli host using an expression
vector) and subjected to expression screening. When an individual progeny polypeptide is
identified by screening to display a favorable change in property (when compared to the
parental polypeptide), it can be sequenced to identify the correspondingly favorable amino
acid substitution contained therein. 

It is appreciated that upon mutagenizing each and every amino acid position in 
a parental polypeptide using saturation mutagenesis as disclosed herein, favorable amino acid 
changes may be identified at more than one amino acid position. One or more new progeny
molecules can be generated that contain a combination of all or part of these favorable amino 
acid substitutions. For example, if 2 specific favorable amino acid changes are identified in 
each of 3 amino acid positions in a polypeptide, the permutations include 3 possibilities at 
each position (no change from the original amino acid and each of two favorable changes)
and 3 positions. Thus, there are 3 x 3 x 3 or 27 total possibilities, including 7 that were
previously examined - 6 single point mutations (i.e., 2 at each of three positions) and no
change at any position.
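
The arithmetic in this example can be reproduced by enumerating the combinations. This is a minimal sketch; the residue labels are hypothetical placeholders, not substitutions taken from the disclosure.

```python
from itertools import product

# Hypothetical example: three positions, each offering the original residue
# ("wt") plus two favorable substitutions. Labels are illustrative only.
options_per_position = [
    ["wt", "A", "B"],
    ["wt", "C", "D"],
    ["wt", "E", "F"],
]

all_combinations = list(product(*options_per_position))


def count_changes(combination):
    """Number of positions that differ from the original residue."""
    return sum(1 for choice in combination if choice != "wt")


# Already covered by the single-site screen: the parent and the 6 single mutants
previously_examined = [c for c in all_combinations if count_changes(c) <= 1]
# Combinations of two or three favorable changes that remain to be made
new_combinations = [c for c in all_combinations if count_changes(c) >= 2]

print(len(all_combinations), len(previously_examined), len(new_combinations))
```

Of the 27 possibilities, 7 were previously examined, which leaves 20 combinations of two or three favorable changes to be generated and screened.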

Thus, in a non-limiting exemplification, this invention provides for the use of 
saturation mutagenesis in combination with additional mutagenization processes, such as a
process where two or more related polynucleotides are introduced into a suitable host cell
such that a hybrid polynucleotide is generated by recombination and reductive reassortment.

In addition to performing mutagenesis along the entire sequence of a gene, the 
instant invention provides that mutagenesis can be used to replace each of any number of bases
in a polynucleotide sequence, wherein the number of bases to be mutagenized is preferably
every integer from 15 to 100,000. Thus, instead of mutagenizing every position along a
molecule, one can subject every or a discrete number of bases (preferably a subset totaling
from 15 to 100,000) to mutagenesis. Preferably, a separate nucleotide is used for
mutagenizing each position or group of positions along a polynucleotide sequence. A group
of 3 positions to be mutagenized may be a codon. The mutations are preferably introduced
using a mutagenic primer, containing a heterologous cassette, also referred to as a mutagenic
cassette. Exemplary cassettes can have from 1 to 500 bases. Each nucleotide position in
such heterologous cassettes can be N, A, C, G, T, A/C, A/G, A/T, C/G, C/T, G/T, C/G/T, A/G/T,
A/C/T, A/C/G, or E, where E is any base that is not A, C, G, or T (E can be referred to as a 
designer oligo). 

In a general sense, saturation mutagenesis is comprised of mutagenizing a
complete set of mutagenic cassettes (wherein each cassette is preferably about 1-500 bases in
length) in a defined polynucleotide sequence to be mutagenized (wherein the sequence to be
mutagenized is preferably from about 15 to 100,000 bases in length). Thus, a group of
mutations (ranging from 1 to 100 mutations) is introduced into each cassette to be
mutagenized. A grouping of mutations to be introduced into one cassette can be different from or
the same as a second grouping of mutations to be introduced into a second cassette during
the application of one round of saturation mutagenesis. Such groupings are exemplified by
deletions, additions, groupings of particular codons and groupings of particular nucleotide
cassettes.

Defined sequences to be mutagenized include a whole gene, pathway, cDNA, 
an entire open reading frame (ORF) and an entire promoter, enhancer, repressor/transactivator,
origin of replication, intron, operator, or any polynucleotide functional group. Generally, a
"defined sequence" for this purpose may be any polynucleotide that is a 15 base
polynucleotide sequence, or a polynucleotide sequence of a length between 15 bases and
15,000 bases (this invention specifically names every integer in between). Considerations in
choosing groupings of codons include types of amino acids encoded by a degenerate 
mutagenic cassette. 

In one exemplification of a grouping of mutations that can be introduced into a
mutagenic cassette, this invention specifically provides for degenerate codon substitutions
(using degenerate oligos) that code for 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
19 and 20 amino acids at each position and a library of polypeptides encoded thereby.

Synthetic Ligation Reassembly (SLR) 

The invention provides a non-stochastic gene modification system termed
"synthetic ligation reassembly," or simply "SLR," a "directed evolution process," to generate
polypeptides, e.g., xylanases or antibodies of the invention, with new or altered properties. 

SLR is a method of ligating oligonucleotide fragments together non- 
stochastically. This method differs from stochastic oligonucleotide shuffling in that the 
nucleic acid building blocks are not shuffled, concatenated or chimerized randomly, but
rather are assembled non-stochastically. See, e.g., U.S. Patent Application Serial No.
(USSN) 09/332,835 entitled "Synthetic Ligation Reassembly in Directed Evolution" and filed
on June 14, 1999 ("USSN 09/332,835"). In one aspect, SLR comprises the following steps:
(a) providing a template polynucleotide, wherein the template polynucleotide comprises
sequence encoding a homologous gene; (b) providing a plurality of building block
polynucleotides, wherein the building block polynucleotides are designed to cross-over
reassemble with the template polynucleotide at a predetermined sequence, and a building
block polynucleotide comprises a sequence that is a variant of the homologous gene and a
sequence homologous to the template polynucleotide flanking the variant sequence; and (c)
combining a building block polynucleotide with a template polynucleotide such that the
building block polynucleotide cross-over reassembles with the template polynucleotide to 
generate polynucleotides comprising homologous gene sequence variations. 

SLR does not depend on the presence of high levels of homology between 
polynucleotides to be rearranged. Thus, this method can be used to non-stochastically
generate libraries (or sets) of progeny molecules comprised of over 10^100 different chimeras.
SLR can be used to generate libraries comprised of over 10^1000 different progeny chimeras.
Thus, aspects of the present invention include non-stochastic methods of producing a set of
finalized chimeric nucleic acid molecules having an overall assembly order that is chosen by
design. This method includes the steps of generating by design a plurality of specific nucleic
acid building blocks having serviceable mutually compatible ligatable ends, and assembling 
these nucleic acid building blocks, such that a designed overall assembly order is achieved. 

The mutually compatible ligatable ends of the nucleic acid building blocks to 
be assembled are considered to be "serviceable" for this type of ordered assembly if they
enable the building blocks to be coupled in predetermined orders. Thus, the overall assembly
order in which the nucleic acid building blocks can be coupled is specified by the design of 
the ligatable ends. If more than one assembly step is to be used, then the overall assembly 
order in which the nucleic acid building blocks can be coupled is also specified by the 
sequential order of the assembly step(s). In one aspect, the annealed building pieces are
treated with an enzyme, such as a ligase (e.g., T4 DNA ligase), to achieve covalent bonding of
the building pieces. 

In one aspect, the design of the oligonucleotide building blocks is obtained by 
analyzing a set of progenitor nucleic acid sequence templates that serve as a basis for 
producing a progeny set of finalized chimeric polynucleotides. These parental
oligonucleotide templates thus serve as a source of sequence information that aids in the
design of the nucleic acid building blocks that are to be mutagenized, e.g., chimerized or 
shuffled. In one aspect of this method, the sequences of a plurality of parental nucleic acid 
templates are aligned in order to select one or more demarcation points. The demarcation 
points can be located at an area of homology, and are comprised of one or more nucleotides.
These demarcation points are preferably shared by at least two of the progenitor templates.
The demarcation points can thereby be used to delineate the boundaries of oligonucleotide 
building blocks to be generated in order to rearrange the parental polynucleotides. The 
demarcation points identified and selected in the progenitor molecules serve as potential 
chimerization points in the assembly of the final chimeric progeny molecules. A demarcation
point can be an area of homology (comprised of at least one homologous nucleotide base)
shared by at least two parental polynucleotide sequences. Alternatively, a demarcation point
can be an area of homology that is shared by at least half of the parental polynucleotide
sequences, or it can be an area of homology that is shared by at least two thirds of the
parental polynucleotide sequences. Even more preferably, a serviceable demarcation point is
an area of homology that is shared by at least three fourths of the parental polynucleotide
sequences, or it can be shared by almost all of the parental polynucleotide sequences. In
one aspect, a demarcation point is an area of homology that is shared by all of the parental 
polynucleotide sequences. 

In one aspect, a ligation reassembly process is performed exhaustively in order
to generate an exhaustive library of progeny chimeric polynucleotides. In other words, all
possible ordered combinations of the nucleic acid building blocks are represented in the set of
finalized chimeric nucleic acid molecules. At the same time, in another aspect, the assembly
order (i.e., the order of assembly of each building block in the 5' to 3' sequence of each
finalized chimeric nucleic acid) in each combination is by design (or non-stochastic) as
described above. Because of the non-stochastic nature of this invention, the possibility of 
unwanted side products is greatly reduced. 

In another aspect, the ligation reassembly method is performed systematically. 
For example, the method is performed in order to generate a systematically
compartmentalized library of progeny molecules, with compartments that can be screened
systematically, e.g. one by one. In other words this invention provides that, through the 
selective and judicious use of specific nucleic acid building blocks, coupled with the selective 
and judicious use of sequentially stepped assembly reactions, a design can be achieved where 
specific sets of progeny products are made in each of several reaction vessels. This allows a
systematic examination and screening procedure to be performed. Thus, these methods allow
a potentially very large number of progeny molecules to be examined systematically in 
smaller groups. Because of its ability to perform chimerizations in a manner that is highly 
flexible yet exhaustive and systematic as well, particularly when there is a low level of 
homology among the progenitor molecules, these methods provide for the generation of a
library (or set) comprised of a large number of progeny molecules. Because of the non-
stochastic nature of the instant ligation reassembly invention, the progeny molecules
generated preferably comprise a library of finalized chimeric nucleic acid molecules having 
an overall assembly order that is chosen by design. The saturation mutagenesis and 
optimized directed evolution methods also can be used to generate different progeny
molecular species. It is appreciated that the invention provides freedom of choice and control
regarding the selection of demarcation points, the size and number of the nucleic acid 
building blocks, and the size and design of the couplings. It is appreciated, furthermore, that 
the requirement for intermolecular homology is highly relaxed for the operability of this 
invention. In fact, demarcation points can even be chosen in areas of little or no
intermolecular homology. For example, because of codon wobble, i.e., the degeneracy of
codons, nucleotide substitutions can be introduced into nucleic acid building blocks without 
altering the amino acid originally encoded in the corresponding progenitor template. 
Alternatively, a codon can be altered such that the coding for an original amino acid is
altered. This invention provides that such substitutions can be introduced into the nucleic
acid building block in order to increase the incidence of intermolecular homologous 
demarcation points and thus to allow an increased number of couplings to be achieved among 
the building blocks, which in turn allows a greater number of progeny chimeric molecules to 
be generated. 
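
The use of codon degeneracy to increase homology between building blocks can be illustrated as follows. This is a minimal sketch, assuming the standard genetic code; the helper names and the example codons are hypothetical and chosen only for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: aa
               for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_codons(codon):
    """Other codons that encode the same amino acid as `codon`."""
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items()
                  if aa == amino_acid and c != codon)


def shared_bases(codon_a, codon_b):
    """Number of positions at which two codons carry the same base."""
    return sum(1 for p, q in zip(codon_a, codon_b) if p == q)


def silent_changes_increasing_homology(codon, partner_codon):
    """Synonymous replacements for `codon` that match `partner_codon` at more
    positions than `codon` does, without changing the encoded amino acid."""
    baseline = shared_bases(codon, partner_codon)
    return [c for c in synonymous_codons(codon)
            if shared_bases(c, partner_codon) > baseline]


# Hypothetical case: one template has CTC (Leu) where another has TTT (Phe).
candidates = silent_changes_increasing_homology("CTC", "TTT")
print(candidates)
```

In this hypothetical case the leucine codon CTC shares one base with TTT, while the synonymous codons CTT, TTA and TTG each share two, so a silent substitution adds one homologous nucleotide at that site.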

Synthetic gene reassembly

In one aspect, the present invention provides a non-stochastic method termed 
synthetic gene reassembly (e.g., GeneReassembly™, see, e.g., U.S. Patent No. 6,537,776), 
which differs from stochastic shuffling in that the nucleic acid building blocks are not 
shuffled or concatenated or chimerized randomly, but rather are assembled non-
stochastically.

The synthetic gene reassembly method does not depend on the presence of a 
high level of homology between polynucleotides to be shuffled. The invention can be used to 
non-stochastically generate libraries (or sets) of progeny molecules comprised of over 10^100
different chimeras. Conceivably, synthetic gene reassembly can even be used to generate
libraries comprised of over 10^1000 different progeny chimeras.

Thus, in one aspect, the invention provides a non-stochastic method of 
producing a set of finalized chimeric nucleic acid molecules having an overall assembly order 
that is chosen by design, which method is comprised of the steps of generating by design a 
plurality of specific nucleic acid building blocks having serviceable mutually compatible 
ligatable ends and assembling these nucleic acid building blocks, such that a designed overall
assembly order is achieved. 

In one aspect, synthetic gene reassembly comprises a method of: 1) preparing 
a progeny generation of molecule(s) (including a molecule comprising a polynucleotide
sequence, e.g., a molecule comprising a polypeptide coding sequence), that is mutagenized to
achieve at least one point mutation, addition, deletion, &/or chimerization, from one or more
ancestral or parental generation template(s); 2) screening the progeny generation molecule(s),
e.g., using a high throughput method, for at least one property of interest (such as an
improvement in an enzyme activity); 3) optionally obtaining &/or cataloguing structural &/or
functional information regarding the parental &/or progeny generation molecules; and 4)
optionally repeating any of steps 1) to 3). In one aspect, there is generated (e.g., from a 
parent polynucleotide template), in what is termed "codon site-saturation mutagenesis," a 
progeny generation of polynucleotides, each having at least one set of up to three contiguous
point mutations (i.e., different bases comprising a new codon), such that every codon (or
every family of degenerate codons encoding the same amino acid) is represented at each 
codon position. Corresponding to, and encoded by, this progeny generation of 
polynucleotides, there is also generated a set of progeny polypeptides, each having at least 
one single amino acid point mutation. In one aspect, there is generated, in what is termed
"amino acid site-saturation mutagenesis", one such mutant polypeptide for each of the 19
naturally encoded polypeptide-forming alpha-amino acid substitutions at each and every
amino acid position along the polypeptide. This yields, for each and every amino acid
position along the parental polypeptide, a total of 20 distinct progeny polypeptides including
the original amino acid, or potentially more than 21 distinct progeny polypeptides if
additional amino acids are used either instead of or in addition to the 20 naturally encoded
amino acids.
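
The set of progeny polypeptides produced by amino acid site-saturation mutagenesis can be sketched as below. The parental sequence is a hypothetical 8-residue example and the function name is illustrative; only the 20 naturally encoded amino acids are considered.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 naturally encoded amino acids


def single_substitution_variants(parent):
    """Return every polypeptide that differs from `parent` at exactly one
    position, using each of the 19 alternative amino acids at that position."""
    variants = []
    for position, original in enumerate(parent):
        for replacement in AMINO_ACIDS:
            if replacement != original:
                variants.append(
                    parent[:position] + replacement + parent[position + 1:]
                )
    return variants


parent = "MKTAYIAK"  # hypothetical parental polypeptide, for illustration only
variants = single_substitution_variants(parent)

# 19 substitutions per position; with the parent residue that is 20 per position
print(len(variants), len(variants) // len(parent))
```

For a parental polypeptide of length L this gives 19 x L single-substitution mutants, or 20 distinct polypeptides per position once the original residue is counted.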

Thus, in another aspect, this approach is also serviceable for generating 
mutants containing, in addition to &/or in combination with the 20 naturally encoded 
polypeptide-forming alpha-amino acids, other rare &/or not naturally-encoded amino acids
and amino acid derivatives. In yet another aspect, this approach is also serviceable for
generating mutants by the use of, in addition to &/or in combination with natural or unaltered
codon recognition systems of suitable hosts, altered, mutagenized, &/or designer codon
recognition systems (such as in a host cell with one or more altered tRNA molecules).

In yet another aspect, this invention relates to recombination and more
specifically to a method for preparing polynucleotides encoding a polypeptide by a method of
in vivo re-assortment of polynucleotide sequences containing regions of partial homology, 
assembling the polynucleotides to form at least one polynucleotide and screening the 
polynucleotides for the production of polypeptide(s) having a useful property.

In yet another aspect, this invention is serviceable for analyzing and
cataloguing, with respect to any molecular property (e.g. an enzymatic activity) or 
combination of properties allowed by current technology, the effects of any mutational 
change achieved (including particularly saturation mutagenesis). Thus, a comprehensive
method is provided for determining the effect of changing each amino acid in a parental
polypeptide into each of at least 19 possible substitutions. This allows each amino acid in a 
parental polypeptide to be characterized and catalogued according to its spectrum of potential 
effects on a measurable property of the polypeptide. 

In one aspect, an intron may be introduced into a chimeric progeny molecule
by way of a nucleic acid building block. Introns often have consensus sequences at both
termini in order to render them operational. In addition to enabling gene splicing, introns 
may serve an additional purpose by providing sites of homology to other nucleic acids to 
enable homologous recombination. For this purpose, and potentially others, it may be 
sometimes desirable to generate a large nucleic acid building block for introducing an intron.
If the size is too large to be generated easily by direct chemical synthesis of two single stranded
oligos, such a specialized nucleic acid building block may also be generated by direct
chemical synthesis of more than two single stranded oligos or by using a polymerase-based
amplification reaction.

The mutually compatible ligatable ends of the nucleic acid building blocks to
be assembled are considered to be "serviceable" for this type of ordered assembly if they
enable the building blocks to be coupled in predetermined orders. Thus, in one aspect, the
overall assembly order in which the nucleic acid building blocks can be coupled is specified
by the design of the ligatable ends and, if more than one assembly step is to be used, then the
overall assembly order in which the nucleic acid building blocks can be coupled is also
specified by the sequential order of the assembly step(s). In one aspect of the invention, the
annealed building pieces are treated with an enzyme, such as a ligase (e.g., T4 DNA ligase) to 
achieve covalent bonding of the building pieces. 

Coupling can occur in a manner that does not make use of every nucleotide in 
a participating overhang. The coupling is particularly likely to survive (e.g., in a transformed
host) if the coupling is reinforced by treatment with a ligase enzyme to form what may be
referred to as a "gap ligation" or a "gapped ligation". This type of coupling can contribute to
generation of unwanted background product(s), but it can also be used advantageously to
increase the diversity of the progeny library generated by the designed ligation reassembly.
Certain overhangs are able to undergo self-coupling to form a palindromic coupling. A
coupling is strengthened substantially if it is reinforced by treatment with a ligase enzyme.
Lack of 5' phosphates on these overhangs can be used advantageously to prevent this type of 
palindromic self-ligation. Accordingly, this invention provides that nucleic acid building 
blocks can be chemically made (or ordered) that lack a 5' phosphate group. Alternatively,
the 5' phosphate groups can be removed, e.g., by treatment with a phosphatase enzyme, such as a calf intestinal
alkaline phosphatase (CIAP), in order to prevent palindromic self-ligations in ligation 
reassembly processes. 
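
Whether a given overhang is capable of palindromic self-coupling can be checked from its sequence alone: an overhang can anneal to an identical copy of itself when it equals its own reverse complement. The following is a minimal sketch; the function names and example overhangs are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def is_self_complementary(overhang):
    """True if the overhang equals its own reverse complement, so that two
    identical ends could anneal to each other (palindromic self-coupling)."""
    sequence = overhang.upper()
    return sequence == reverse_complement(sequence)


# Hypothetical overhang sequences, written 5' to 3'
for overhang in ("AATT", "GATC", "ACGA", "ACG"):
    print(overhang, is_self_complementary(overhang))
```

A design step could use such a check to flag overhangs that warrant dephosphorylation or redesign. Overhangs with an odd number of nucleotides can never be self-complementary, because the central base would have to pair with itself.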

In another aspect, the design of nucleic acid building blocks is obtained upon
analysis of the sequences of a set of progenitor nucleic acid templates that serve as a basis for
producing a progeny set of finalized chimeric nucleic acid molecules. These progenitor
nucleic acid templates thus serve as a source of sequence information that aids in the design
of the nucleic acid building blocks that are to be mutagenized, i.e., chimerized or shuffled.

In one exemplification, the invention provides for the chimerization of a 
family of related genes and their encoded family of related products. In a particular 
exemplification, the encoded products are enzymes. The xylanases of the present invention
can be mutagenized in accordance with the methods described herein. 

Thus, according to one aspect of the invention, the sequences of a plurality of
progenitor nucleic acid templates (e.g., polynucleotides of Group A nucleic acid sequences)
are aligned in order to select one or more demarcation points, which demarcation points can
be located at an area of homology. The demarcation points can be used to delineate the
boundaries of nucleic acid building blocks to be generated. Thus, the demarcation points 
identified and selected in the progenitor molecules serve as potential chimerization points in 
the assembly of the progeny molecules. 

Typically a serviceable demarcation point is an area of homology (comprised
of at least one homologous nucleotide base) shared by at least two progenitor templates, but
the demarcation point can be an area of homology that is shared by at least half of the
progenitor templates, at least two thirds of the progenitor templates, at least three fourths of
the progenitor templates and preferably almost all of the progenitor templates. Even more
preferably still, a serviceable demarcation point is an area of homology that is shared by all of
the progenitor templates.
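
The selection of candidate demarcation points from aligned progenitor templates can be sketched as a scan over alignment columns. This is a minimal illustration under stated assumptions: the templates are already aligned to equal length with "-" marking gaps, a demarcation point is taken to be a single shared nucleotide, and the function name, the `min_fraction` parameter and the example sequences are hypothetical.

```python
from collections import Counter


def candidate_demarcation_points(aligned_templates, min_fraction=1.0):
    """Return (column, base, number_of_templates) for each alignment column
    whose most common base is shared by at least two templates and by at
    least `min_fraction` of all templates."""
    length = len(aligned_templates[0])
    if any(len(t) != length for t in aligned_templates):
        raise ValueError("templates must be pre-aligned to equal length")
    points = []
    for column in range(length):
        counts = Counter(t[column] for t in aligned_templates
                         if t[column] != "-")
        if not counts:
            continue  # column contains only gaps
        base, shared_by = counts.most_common(1)[0]
        if shared_by >= 2 and shared_by / len(aligned_templates) >= min_fraction:
            points.append((column, base, shared_by))
    return points


# Hypothetical aligned progenitor templates
templates = ["ATGGCA", "ATGTCA", "ACGGCT"]
shared_by_all = [p[0] for p in candidate_demarcation_points(templates)]
shared_by_half = [p[0] for p in candidate_demarcation_points(templates, 0.5)]
print(shared_by_all, shared_by_half)
```

Lowering `min_fraction` from 1.0 toward 0.5 corresponds to relaxing the requirement from homology shared by all templates to homology shared by at least half of them.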

In one aspect, the gene reassembly process is performed exhaustively in
order to generate an exhaustive library. In other words, all possible ordered combinations of
the nucleic acid building blocks are represented in the set of finalized chimeric nucleic acid
molecules. At the same time, the assembly order (i.e., the order of assembly of each building
block in the 5' to 3' sequence of each finalized chimeric nucleic acid) in each combination is
by design (or non-stochastic). Because of the non-stochastic nature of the method, the 
possibility of unwanted side products is greatly reduced. 

In another aspect, the method provides that the gene reassembly process is 
performed systematically, for example to generate a systematically compartmentalized
library, with compartments that can be screened systematically, e.g., one by one. In other 
words the invention provides that, through the selective and judicious use of specific nucleic 
acid building blocks, coupled with the selective and judicious use of sequentially stepped 
assembly reactions, an experimental design can be achieved where specific sets of progeny
products are made in each of several reaction vessels. This allows a systematic examination
and screening procedure to be performed. Thus, it allows a potentially very large number of 
progeny molecules to be examined systematically in smaller groups. 
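
The exhaustive and the systematically compartmentalized libraries described above can be illustrated by enumeration. This is a minimal sketch; the block names, the number of blocks per slot and the choice to compartmentalize by the first building block are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Hypothetical building blocks, grouped by their designed slot in the
# 5' to 3' assembly order. Names are placeholders only.
blocks_by_slot = [
    ["A1", "A2", "A3"],
    ["B1", "B2"],
    ["C1", "C2", "C3", "C4"],
]

# Exhaustive library: every ordered combination, one block per slot
library = list(product(*blocks_by_slot))

# Compartmentalized library: one reaction vessel per choice of first block,
# so that each vessel holds a defined subset that can be screened on its own
vessels = defaultdict(list)
for chimera in library:
    vessels[chimera[0]].append(chimera)

print(len(library), {name: len(members) for name, members in vessels.items()})
```

With 3, 2 and 4 blocks in the three slots, the exhaustive library contains 3 x 2 x 4 = 24 chimeras, examined here as three vessels of eight. The assembly order is fixed by design, so only the choice of block at each slot varies.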

Because of its ability to perform chimerizations in a manner that is highly 
flexible yet exhaustive and systematic as well, particularly when there is a low level of
homology among the progenitor molecules, the instant invention provides for the generation
of a library (or set) comprised of a large number of progeny molecules. Because of the non- 
stochastic nature of the instant gene reassembly invention, the progeny molecules generated 
preferably comprise a library of finalized chimeric nucleic acid molecules having an overall 
assembly order that is chosen by design. In a particular aspect, such a generated library is
comprised of greater than 10^3 to greater than 10^1000 different progeny molecular species.

In one aspect, a set of finalized chimeric nucleic acid molecules, produced as
described, is comprised of a polynucleotide encoding a polypeptide. According to one aspect,
this polynucleotide is a gene, which may be a man-made gene. According to another aspect,
this polynucleotide is a gene pathway, which may be a man-made gene pathway. The
invention provides that one or more man-made genes generated by the invention may be
incorporated into a man-made gene pathway, such as a pathway operable in a eukaryotic
organism (including a plant). 

In another exemplification, the synthetic nature of the step in which the
building blocks are generated allows the design and introduction of nucleotides (e.g., one or
more nucleotides, which may be, for example, codons or introns or regulatory sequences) that
can later be optionally removed in an in vitro process (e.g., by mutagenesis) or in an in vivo
process (e.g., by utilizing the gene splicing ability of a host organism). It is appreciated that
in many instances the introduction of these nucleotides may also be desirable for many other 
reasons in addition to the potential benefit of creating a serviceable demarcation point.

Thus, according to another aspect, the invention provides that a nucleic acid
building block can be used to introduce an intron. Thus, the invention provides that 
functional introns may be introduced into a man-made gene of the invention. The invention 
also provides that functional introns may be introduced into a man-made gene pathway of the
invention. Accordingly, the invention provides for the generation of a chimeric
polynucleotide that is a man-made gene containing one (or more) artificially introduced
intron(s). 

Accordingly, the invention also provides for the generation of a chimeric 
polynucleotide that is a man-made gene pathway containing one (or more) artificially 
introduced intron(s). Preferably, the artificially introduced intron(s) are functional in one or
more host cells for gene splicing much in the way that naturally-occurring introns serve 
functionally in gene splicing. The invention provides a process of producing man-made 
intron-containing polynucleotides to be introduced into host organisms for recombination 
and/or splicing. 

A man-made gene produced using the invention can also serve as a substrate
for recombination with another nucleic acid. Likewise, a man-made gene pathway produced
using the invention can also serve as a substrate for recombination with another nucleic acid.
In one aspect, the recombination is facilitated by, or occurs at, areas of homology between
the man-made, intron-containing gene and a nucleic acid, which serves as a recombination
partner. In one aspect, the recombination partner may also be a nucleic acid generated by the
invention, including a man-made gene or a man-made gene pathway. Recombination may be 
facilitated by or may occur at areas of homology that exist at the one (or more) artificially 
introduced intron(s) in the man-made gene. 

The synthetic gene reassembly method of the invention utilizes a plurality of
nucleic acid building blocks, each of which preferably has two ligatable ends. The two
ligatable ends on each nucleic acid building block may be two blunt ends (i.e., each having an
overhang of zero nucleotides), or preferably one blunt end and one overhang, or more 
preferably still two overhangs. 

A useful overhang for this purpose may be a 3' overhang or a 5' overhang.
Thus, a nucleic acid building block may have a 3' overhang or alternatively a 5' overhang or
alternatively two 3' overhangs or alternatively two 5' overhangs. The overall order in which
the nucleic acid building blocks are assembled to form a finalized chimeric nucleic acid 
molecule is determined by purposeful experimental design and is not random.

In one aspect, a nucleic acid building block is generated by chemical synthesis
of two single-stranded nucleic acids (also referred to as single-stranded oligos) and contacting 
them so as to allow them to anneal to form a double-stranded nucleic acid building block. 

A double-stranded nucleic acid building block can be of variable size. The
sizes of these building blocks can be small or large. Exemplary sizes for building blocks
range from 1 base pair (not including any overhangs) to 100,000 base pairs (not including any
overhangs). Other exemplary size ranges are also provided, which have lower limits of from
1 bp to 10,000 bp (including every integer value in between) and upper limits of from 2 bp to
100,000 bp (including every integer value in between).

Many methods exist by which a double-stranded nucleic acid building block
can be generated that is serviceable for the invention; and these are known in the art and can
be readily performed by the skilled artisan. 

According to one aspect, a double-stranded nucleic acid building block is 
generated by first generating two single stranded nucleic acids and allowing them to anneal to 
form a double-stranded nucleic acid building block. The two strands of a double-stranded
nucleic acid building block may be complementary at every nucleotide apart from any that 
form an overhang; thus containing no mismatches, apart from any overhang(s). According to 
another aspect, the two strands of a double-stranded nucleic acid building block are 
complementary at fewer than every nucleotide apart from any that form an overhang. Thus, 
according to this aspect, a double-stranded nucleic acid building block can be used to
introduce codon degeneracy. Preferably the codon degeneracy is introduced using the site-
saturation mutagenesis described herein, using one or more N,N,G/T cassettes or alternatively
using one or more N,N,N cassettes. 

The in vivo recombination method of the invention can be performed blindly
on a pool of unknown hybrids or alleles of a specific polynucleotide or sequence. It is
not necessary to know the actual DNA or RNA sequence of the specific polynucleotide.

The approach of using recombination within a mixed population of genes can
be useful for the generation of any useful proteins, for example, interleukin I, antibodies, tPA
and growth hormone. This approach may be used to generate proteins having altered
specificity or activity. The approach may also be useful for the generation of hybrid nucleic
acid sequences, for example, promoter regions, introns, exons, enhancer sequences, 3'
untranslated regions or 5' untranslated regions of genes. Thus this approach may be used to
generate genes having increased rates of expression. This approach may also be useful in the
study of repetitive DNA sequences. Finally, this approach may be useful to mutate
ribozymes or aptamers.

In one aspect the invention described herein is directed to the use of repeated
cycles of reductive reassortment, recombination and selection which allow for the directed
molecular evolution of highly complex linear sequences, such as DNA, RNA or proteins,
through recombination.

Optimized Directed Evolution System 

The invention provides a non-stochastic gene modification system termed
"optimized directed evolution system" to generate polypeptides, e.g., xylanases or antibodies
of the invention, with new or altered properties. Optimized directed evolution is directed to
the use of repeated cycles of reductive reassortment, recombination and selection that allow
for the directed molecular evolution of nucleic acids through recombination. Optimized
directed evolution allows generation of a large population of evolved chimeric sequences,
wherein the generated population is significantly enriched for sequences that have a
predetermined number of crossover events.

A crossover event is a point in a chimeric sequence where a shift in sequence
occurs from one parental variant to another parental variant. Such a point is normally at the
juncture of where oligonucleotides from two parents are ligated together to form a single
sequence. This method allows calculation of the correct concentrations of oligonucleotide
sequences so that the final chimeric population of sequences is enriched for the chosen
number of crossover events. This provides more control over choosing chimeric variants
having a predetermined number of crossover events.

In addition, this method provides a convenient means for exploring a
tremendous amount of the possible protein variant space in comparison to other systems.
Previously, if one generated, for example, 10^13 chimeric molecules during a reaction, it would
be extremely difficult to test such a high number of chimeric variants for a particular activity.
Moreover, a significant portion of the progeny population would have a very high number of
crossover events, which resulted in proteins that were less likely to have increased levels of a
particular activity. By using these methods, the population of chimeric molecules can be
enriched for those variants that have a particular number of crossover events. Thus, although
one can still generate 10^13 chimeric molecules during a reaction, each of the molecules
chosen for further analysis most likely has, for example, only three crossover events.
Because the resulting progeny population can be skewed to have a predetermined number of
crossover events, the boundaries on the functional variety between the chimeric molecules are
reduced. This provides a more manageable number of variables when calculating which
oligonucleotide from the original parental polynucleotides might be responsible for affecting
a particular trait.

One method for creating a chimeric progeny polynucleotide sequence is to
create oligonucleotides corresponding to fragments or portions of each parental sequence.
Each oligonucleotide preferably includes a unique region of overlap so that mixing the
oligonucleotides together results in a new variant that has each oligonucleotide fragment
assembled in the correct order. Additional information can also be found, e.g., in USSN
09/332,835; U.S. Patent No. 6,361,974.

The number of oligonucleotides generated for each parental variant bears a
relationship to the total number of resulting crossovers in the chimeric molecule that is
ultimately created. For example, three parental nucleotide sequence variants might be
provided to undergo a ligation reaction in order to find a chimeric variant having, for
example, greater activity at high temperature. As one example, a set of 50 oligonucleotide
sequences can be generated corresponding to each portion of each parental variant.
Accordingly, during the ligation reassembly process there could be up to 50 crossover events
within each of the chimeric sequences. The probability that each of the generated chimeric
polynucleotides will contain oligonucleotides from each parental variant in alternating order
is very low. If each oligonucleotide fragment is present in the ligation reaction in the same
molar quantity, it is likely that in some positions oligonucleotides from the same parental
polynucleotide will ligate next to one another and thus not result in a crossover event. If the
concentration of each oligonucleotide from each parent is kept constant during any ligation
step in this example, there is a 1/3 chance (assuming 3 parents) that an oligonucleotide from
the same parental variant will ligate within the chimeric sequence and produce no crossover.
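
The equimolar case in this example can be worked through numerically. The sketch below is an illustration only: it assumes that each junction between adjacent fragments is filled independently, and that 50 fragments give 49 junctions at which a crossover can be counted. The function name `crossover_pdf` is hypothetical.

```python
from math import comb


def crossover_pdf(num_parents, num_fragments):
    """Distribution of crossover counts when all parents are equimolar."""
    junctions = num_fragments - 1       # assumption: one junction between neighbours
    p_cross = 1.0 - 1.0 / num_parents   # neighbour drawn from a different parent
    return [
        comb(junctions, k) * p_cross**k * (1.0 - p_cross) ** (junctions - k)
        for k in range(junctions + 1)
    ]


pdf = crossover_pdf(num_parents=3, num_fragments=50)
mean = sum(k * p for k, p in enumerate(pdf))
mode = max(range(len(pdf)), key=pdf.__getitem__)
print(round(mean, 2), mode)  # 32.67 33
```

Under these assumptions an equimolar ligation centers on roughly 33 crossovers per chimera, far more than a small target such as three, which is why the fragment concentrations need to be adjusted.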

Accordingly, a probability density function (PDF) can be determined to
predict the population of crossover events that are likely to occur during each step in a
ligation reaction given a set number of parental variants, a number of oligonucleotides
corresponding to each variant, and the concentrations of each variant during each step in the
ligation reaction. The statistics and mathematics behind determining the PDF are described
below. By utilizing these methods, one can calculate such a probability density function, and
thus enrich the chimeric progeny population for a predetermined number of crossover events
resulting from a particular ligation reaction. Moreover, a target number of crossover events
can be predetermined, and the system then programmed to calculate the starting quantities of
each parental oligonucleotide during each step in the ligation reaction to result in a
probability density function that centers on the predetermined number of crossover events.
These methods are directed to the use of repeated cycles of reductive reassortment, 
recombination and selection that allow for the directed molecular evolution of a nucleic acid 
encoding a polypeptide through recombination. This system allows generation of a large 
population of evolved chimeric sequences, wherein the generated population is significantly 
enriched for sequences that have a predetermined number of crossover events. A crossover 
event is a point in a chimeric sequence where a shift in sequence occurs from one parental 
variant to another parental variant. Such a point is normally at the juncture of where
oligonucleotides from two parents are ligated together to form a single sequence. The 
method allows calculation of the correct concentrations of oligonucleotide sequences so that 
the final chimeric population of sequences is enriched for the chosen number of crossover 
events. This provides more control over choosing chimeric variants having a predetermined 
number of crossover events. 

In addition, these methods provide a convenient means for exploring a 
tremendous amount of the possible protein variant space in comparison to other systems. By 
using the methods described herein, the population of chimeric molecules can be enriched
for those variants that have a particular number of crossover events. Thus, although one can
still generate 10^13 chimeric molecules during a reaction, each of the molecules chosen for
further analysis most likely has, for example, only three crossover events. Because the
resulting progeny population can be skewed to have a predetermined number of crossover
events, the boundaries on the functional variety between the chimeric molecules are reduced.
This provides a more manageable number of variables when calculating which 
oligonucleotide from the original parental polynucleotides might be responsible for affecting 
a particular trait. 

In one aspect, the method creates a chimeric progeny polynucleotide sequence 
by creating oligonucleotides corresponding to fragments or portions of each parental 
sequence. Each oligonucleotide preferably includes a unique region of overlap so that mixing 
the oligonucleotides together results in a new variant that has each oligonucleotide fragment 
assembled in the correct order. See also USSN 09/332,835.

Determining Crossover Events 

Aspects of the invention include a system and software that receive a desired
crossover probability density function (PDF), the number of parent genes to be reassembled,
and the number of fragments in the reassembly as inputs. The output of this program is a
"fragment PDF" that can be used to determine a recipe for producing reassembled genes, and
the estimated crossover PDF of those genes. The processing described herein is preferably
performed in MATLAB™ (The Mathworks, Natick, Massachusetts), a programming language
and development environment for technical computing.
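
One way such a calculation might look is sketched below in Python rather than MATLAB. This is not the program described above. It is a simplified model, assumed for illustration, in which one "dominant" parent supplies a fraction q of the fragments at every position, the remaining parents share the rest equally, and q is chosen so that the expected number of crossovers equals the target. The names `junction_crossover_prob` and `dominant_fraction` are hypothetical.

```python
def junction_crossover_prob(q, num_parents):
    """Chance that two adjacent fragments come from different parents."""
    rest = (1.0 - q) / (num_parents - 1)  # share of each non-dominant parent
    return 1.0 - (q * q + (num_parents - 1) * rest * rest)


def dominant_fraction(target_crossovers, num_parents, num_fragments, tol=1e-9):
    """Bisect for the fraction q whose expected crossover count hits the target."""
    junctions = num_fragments - 1
    lo, hi = 1.0 / num_parents, 1.0       # equimolar mix ... single parent only
    max_mean = junctions * junction_crossover_prob(lo, num_parents)
    if not 0.0 <= target_crossovers <= max_mean:
        raise ValueError("target is outside the reachable range for this model")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if junctions * junction_crossover_prob(mid, num_parents) > target_crossovers:
            lo = mid                      # too many crossovers: skew the mix further
        else:
            hi = mid
    return (lo + hi) / 2.0


q = dominant_fraction(target_crossovers=3, num_parents=3, num_fragments=50)
expected = 49 * junction_crossover_prob(q, 3)
print(round(q, 3), round(expected, 3))
```

The sketch matches only the mean of the crossover distribution. A program of the kind described above would accept a full desired PDF and could vary the fragment ratios from position to position.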

Iterative Processes 

In practicing the invention, these processes can be iteratively repeated. For 
example, a nucleic acid (or, the nucleic acid) responsible for an altered or new xylanase
phenotype is identified, re-isolated, again modified, and re-tested for activity. This process can
be iteratively repeated until a desired phenotype is engineered. For example, an entire 
biochemical anabolic or catabolic pathway can be engineered into a cell, including, e.g., 
xylanase activity. 

Similarly, if it is determined that a particular oligonucleotide has no effect at
all on the desired trait (e.g., a new xylanase phenotype), it can be removed as a variable by
synthesizing larger parental oligonucleotides that include the sequence to be removed. Since
incorporating the sequence within a larger sequence prevents any crossover events, there will
no longer be any variation of this sequence in the progeny polynucleotides. This iterative
practice of determining which oligonucleotides are most related to the desired trait, and
which are unrelated, allows more efficient exploration of all of the possible protein variants
that might provide a particular trait or activity.

In vivo shuffling 

In vivo shuffling of molecules is used in methods of the invention that provide
variants of polypeptides of the invention, e.g., antibodies, xylanases, and the like. In vivo 
shuffling can be performed utilizing the natural property of cells to recombine multimers. 
While recombination in vivo has provided the major natural route to molecular diversity, 
genetic recombination remains a relatively complex process that involves 1) the recognition 
of homologies; 2) strand cleavage, strand invasion, and metabolic steps leading to the 
production of recombinant chiasma; and finally 3) the resolution of chiasma into discrete 
recombined molecules. The formation of the chiasma requires the recognition of 
homologous sequences. 

In another aspect, the invention includes a method for producing a hybrid
polynucleotide from at least a first polynucleotide and a second polynucleotide. The
invention can be used to produce a hybrid polynucleotide by introducing at least a first
polynucleotide and a second polynucleotide which share at least one region of partial
sequence homology (e.g., SEQ ID NOS: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29,
31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79,
81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121,
123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157,
159, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193,
195, 197, 199, 201, 203, 205, 207, 209, 211, 213, 215, 217, 219, 221, 223, 225, 227, 229,
231, 233, 235, 237, 239, 241, 243, 245, 247, 249, 251, 253, 255, 257 and combinations
thereof) into a suitable host cell. The regions of partial sequence homology promote
processes which result in sequence reorganization producing a hybrid polynucleotide. The
term "hybrid polynucleotide", as used herein, is any nucleotide sequence which results from
the method of the present invention and contains sequence from at least two original
polynucleotide sequences. Such hybrid polynucleotides can result from intermolecular
recombination events which promote sequence integration between DNA molecules. In
addition, such hybrid polynucleotides can result from intramolecular reductive reassortment
processes which utilize repeated sequences to alter a nucleotide sequence within a DNA
molecule.

In vivo reassortment is focused on "inter-molecular" processes collectively
referred to as "recombination", which in bacteria is generally viewed as a "RecA-dependent"
phenomenon. The invention can rely on recombination processes of a host cell to recombine
and re-assort sequences, or the cells' ability to mediate reductive processes to decrease the
complexity of quasi-repeated sequences in the cell by deletion. This process of "reductive
reassortment" occurs by an "intra-molecular", RecA-independent process.

Therefore, in another aspect of the invention, novel polynucleotides can be
generated by the process of reductive reassortment. The method involves the generation of
constructs containing consecutive sequences (original encoding sequences), their insertion
into an appropriate vector and their subsequent introduction into an appropriate host cell.
The reassortment of the individual molecular identities occurs by combinatorial processes
between the consecutive sequences in the construct possessing regions of homology, or
between quasi-repeated units. The reassortment process recombines and/or reduces the
complexity and extent of the repeated sequences and results in the production of novel
molecular species. Various treatments may be applied to enhance the rate of reassortment.
These could include treatment with ultra-violet light, or DNA damaging chemicals and/or the
use of host cell lines displaying enhanced levels of "genetic instability". Thus the
reassortment process may involve homologous recombination or the natural property of
quasi-repeated sequences to direct their own evolution.

Repeated or "quasi-repeated" sequences play a role in genetic instability. In 
the present invention, "quasi-repeats" are repeats that are not restricted to their original unit 
structure. Quasi-repeated units can be presented as an array of sequences in a construct; 
consecutive units of similar sequences. Once ligated, the junctions between the consecutive 
sequences become essentially invisible and the quasi-repetitive nature of the resulting 
construct is now continuous at the molecular level. The deletion process the cell performs to 
reduce the complexity of the resulting construct operates between the quasi-repeated 
sequences. The quasi-repeated units provide a practically limitless repertoire of templates 
upon which slippage events can occur. The constructs containing the quasi-repeats thus 
effectively provide sufficient molecular elasticity that deletion (and potentially insertion) 
events can occur virtually anywhere within the quasi-repetitive units. 

When the quasi-repeated sequences are all ligated in the same orientation, for 
instance head to tail or vice versa, the cell cannot distinguish individual units. Consequently, 
the reductive process can occur throughout the sequences. In contrast, when for example, the 
units are presented head to head, rather than head to tail, the inversion delineates the 
endpoints of the adjacent unit so that deletion formation will favor the loss of discrete units. 
Thus, it is preferable with the present method that the sequences are in the same orientation. 
Random orientation of quasi-repeated sequences will result in the loss of reassortment 
efficiency, while consistent orientation of the sequences will offer the highest efficiency. 
However, while having fewer of the contiguous sequences in the same orientation decreases 
the efficiency, it may still provide sufficient elasticity for the effective recovery of novel 
molecules. Constructs can be made with the quasi-repeated sequences in the same orientation
to allow higher efficiency.

Sequences can be assembled in a head to tail orientation using any of a variety
of methods, including the following:

a) Primers that include a poly-A head and poly-T tail which when made
single-stranded would provide orientation can be utilized. This is accomplished by
having the first few bases of the primers made from RNA and hence easily
removed by RNase H.

b) Primers that include unique restriction cleavage sites can be utilized. Multiple
sites, a battery of unique sequences and repeated synthesis and ligation steps
would be required.

c) The inner few bases of the primer could be thiolated and an exonuclease used to
produce properly tailed molecules.

The recovery of the re-assorted sequences relies on the identification of
cloning vectors with a reduced repetitive index (RI). The re-assorted encoding sequences can
then be recovered by amplification. The products are re-cloned and expressed. The recovery
of cloning vectors with reduced RI can be effected by:

1) The use of vectors only stably maintained when the construct is reduced in 
complexity. 

2) The physical recovery of shortened vectors by physical procedures. In this case, the
cloning vector would be recovered using standard plasmid isolation procedures and
size fractionated on either an agarose gel, or column with a low molecular weight
cutoff, utilizing standard procedures.

3) The recovery of vectors containing interrupted genes which can be selected when 
insert size decreases. 

4) The use of direct selection techniques with an expression vector and the appropriate
selection.

Encoding sequences (for example, genes) from related organisms may
demonstrate a high degree of homology and encode quite diverse protein products. These
types of sequences are particularly useful in the present invention as quasi-repeats. However,
while the examples illustrated below demonstrate the reassortment of nearly identical original
encoding sequences (quasi-repeats), this process is not limited to such nearly identical
repeats.

The following example demonstrates a method of the invention. Encoding
nucleic acid sequences (quasi-repeats) derived from three (3) unique species are described.
Each sequence encodes a protein with a distinct set of properties. Each of the sequences
differs by a single or a few base pairs at a unique position in the sequence. The
quasi-repeated sequences are separately or collectively amplified and ligated into random
assemblies such that all possible permutations and combinations are available in the
population of ligated molecules. The number of quasi-repeat units can be controlled by the
assembly conditions. The average number of quasi-repeated units in a construct is defined as
the repetitive index (RI).
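
The definition of the repetitive index can be illustrated with a short calculation. In this sketch the three quasi-repeat sequences are represented by the hypothetical labels A, B and C, a construct is written as a string of such labels, and the small pool of constructs is invented purely for illustration.

```python
from itertools import product
from statistics import mean

UNITS = ("A", "B", "C")  # hypothetical labels for the three quasi-repeat sequences


def ordered_assemblies(length):
    """All ordered constructs containing `length` units, repeats allowed."""
    return ["".join(p) for p in product(UNITS, repeat=length)]


def repetitive_index(constructs):
    """Average number of quasi-repeated units per construct."""
    return mean(len(c) for c in constructs)


three_unit = ordered_assemblies(3)
mixed_pool = ["AB", "ABC", "CABA", "BCCAB"]  # illustrative pool, not real data
print(len(three_unit))               # 27
print(repetitive_index(mixed_pool))  # 3.5
```

With three source sequences there are 27 distinct ordered three-unit constructs, and a pool whose constructs carry 2, 3, 4 and 5 units has an RI of 3.5.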

Once formed, the constructs may, or may not be size fractionated on an 
agarose gel according to published protocols, inserted into a cloning vector and transfected
into an appropriate host cell. The cells are then propagated and "reductive reassortment" is
effected. The rate of the reductive reassortment process may be stimulated by the 
introduction of DNA damage if desired. Whether the reduction in RI is mediated by deletion 
formation between repeated sequences by an "intra-molecular" mechanism, or mediated by 
recombination-like events through "inter-molecular" mechanisms is immaterial. The end
result is a reassortment of the molecules into all possible combinations. 

Optionally, the method comprises the additional step of screening the library 
members of the shuffled pool to identify individual shuffled library members having the 
ability to bind or otherwise interact, or catalyze a particular reaction (e.g., such as a catalytic
domain of an enzyme) with a predetermined macromolecule, such as for example a 
proteinaceous receptor, an oligosaccharide, virion, or other predetermined compound or 
structure. 

The polypeptides that are identified from such libraries can be used for 
therapeutic, diagnostic, research and related purposes (e.g., catalysts, solutes for increasing 
osmolality of an aqueous solution and the like) and/or can be subjected to one or more
additional cycles of shuffling and/or selection. 

In another aspect, it is envisioned that prior to or during recombination or 
reassortment, polynucleotides generated by the method of the invention can be subjected to 
agents or processes which promote the introduction of mutations into the original 
polynucleotides. The introduction of such mutations would increase the diversity of resulting 
hybrid polynucleotides and polypeptides encoded therefrom. The agents or processes which 
promote mutagenesis can include, but are not limited to: (+)-CC-1065, or a synthetic analog
such as (+)-CC-1065-(N3-Adenine) (see Sun and Hurley (1992)); an N-acetylated or
deacetylated 4'-fluoro-4-aminobiphenyl adduct capable of inhibiting DNA synthesis (see, for
example, van de Poll et al. (1992)); or an N-acetylated or deacetylated 4-aminobiphenyl
adduct capable of inhibiting DNA synthesis (see also van de Poll et al. (1992), pp. 751-758);
trivalent chromium, a trivalent chromium salt, a polycyclic aromatic hydrocarbon (PAH)
DNA adduct capable of inhibiting DNA replication, such as
7-bromomethyl-benz[a]anthracene ("BMA"), tris(2,3-dibromopropyl)phosphate ("Tris-BP"),
1,2-dibromo-3-chloropropane ("DBCP"), 2-bromoacrolein (2BA),
benzo[a]pyrene-7,8-dihydrodiol-9-10-epoxide ("BPDE"), a platinum(II) halogen salt,
N-hydroxy-2-amino-3-methylimidazo[4,5-f]-quinoline ("N-hydroxy-IQ") and
N-hydroxy-2-amino-1-methyl-6-phenylimidazo[4,5-f]-pyridine ("N-hydroxy-PhIP"). Exemplary
means for slowing or halting PCR amplification consist of UV light, (+)-CC-1065 and
(+)-CC-1065-(N3-Adenine). Particularly encompassed
means are DNA adducts or polynucleotides comprising the DNA adducts from the
polynucleotides or polynucleotides pool, which can be released or removed by a process
including heating the solution comprising the polynucleotides prior to further processing.

In another aspect the invention is directed to a method of producing 
recombinant proteins having biological activity by treating a sample comprising double- 
stranded template polynucleotides encoding a wild-type protein under conditions according to 
the invention which provide for the production of hybrid or re-assorted polynucleotides. 

Producing sequence variants 

The invention also provides additional methods for making sequence variants 
of the nucleic acid (e.g., xylanase) sequences of the invention. The invention also provides 
additional methods for isolating xylanases using the nucleic acids and polypeptides of the 
invention. In one aspect, the invention provides for variants of a xylanase coding sequence 
(e.g., a gene, cDNA or message) of the invention, which can be altered by any means, 
including, e.g., random or stochastic methods, or, non-stochastic, or "directed evolution," 
methods, as described above. 

The isolated variants may be naturally occurring. Variants can also be created
in vitro. Variants may be created using genetic engineering techniques such as site directed
mutagenesis, random chemical mutagenesis, Exonuclease III deletion procedures, and
standard cloning techniques. Alternatively, such variants, fragments, analogs, or derivatives 
may be created using chemical synthesis or modification procedures. Other methods of 
making variants are also familiar to those skilled in the art. These include procedures in 
which nucleic acid sequences obtained from natural isolates are modified to generate nucleic 
acids which encode polypeptides having characteristics which enhance their value in 
industrial or laboratory applications. In such procedures, a large number of variant sequences 
having one or more nucleotide differences with respect to the sequence obtained from the 
natural isolate are generated and characterized. These nucleotide differences can result in 
amino acid changes with respect to the polypeptides encoded by the nucleic acids from the 
natural isolates. 

For example, variants may be created using error prone PCR. In error prone
PCR, PCR is performed under conditions where the copying fidelity of the DNA polymerase
is low, such that a high rate of point mutations is obtained along the entire length of the PCR
product. Error prone PCR is described, e.g., in Leung, D.W., et al., Technique, 1:11-15,
1989, and Caldwell, R.C. & Joyce, G.F., PCR Methods Applic., 2:28-33, 1992. Briefly, in
such procedures, nucleic acids to be mutagenized are mixed with PCR primers, reaction
buffer, MgCl2, MnCl2, Taq polymerase and an appropriate concentration of dNTPs for
achieving a high rate of point mutation along the entire length of the PCR product. For
example, the reaction may be performed using 20 fmoles of nucleic acid to be mutagenized,
30 pmole of each PCR primer, a reaction buffer comprising 50 mM KCl, 10 mM Tris HCl (pH
8.3) and 0.01% gelatin, 7 mM MgCl2, 0.5 mM MnCl2, 5 units of Taq polymerase, 0.2 mM
dGTP, 0.2 mM dATP, 1 mM dCTP, and 1 mM dTTP. PCR may be performed for 30 cycles of
94°C for 1 min, 45°C for 1 min, and 72°C for 1 min. However, it will be appreciated that
these parameters may be varied as appropriate. The mutagenized nucleic acids are cloned
into an appropriate vector and the activities of the polypeptides encoded by the mutagenized
nucleic acids are evaluated.
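
The exemplary reaction above can be recorded as data so that derived quantities are easy to check. The sketch below is only an illustration: the dNTP concentrations and cycling steps are those stated in the example, while the variable and function names are assumptions made here.

```python
# Final dNTP concentrations (mM) from the exemplary error prone PCR reaction.
DNTP_MM = {"dGTP": 0.2, "dATP": 0.2, "dCTP": 1.0, "dTTP": 1.0}

# (temperature in degrees C, hold time in minutes) for a single cycle.
CYCLE_STEPS = [(94, 1), (45, 1), (72, 1)]
NUM_CYCLES = 30


def hold_time_minutes(steps, cycles):
    """Total programmed hold time, ignoring ramping between temperatures."""
    return cycles * sum(minutes for _, minutes in steps)


def dntp_bias(dntps):
    """Ratio of the most to the least concentrated dNTP (1.0 means balanced)."""
    return max(dntps.values()) / min(dntps.values())


print(hold_time_minutes(CYCLE_STEPS, NUM_CYCLES))  # 90
print(dntp_bias(DNTP_MM))                          # 5.0
```

The five-fold excess of dCTP and dTTP over dGTP and dATP is an unbalanced nucleotide pool, which is one of the conditions the example combines with MnCl2 and elevated MgCl2.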

Variants may also be created using oligonucleotide directed mutagenesis to
generate site-specific mutations in any cloned DNA of interest. Oligonucleotide mutagenesis
is described, e.g., in Reidhaar-Olson (1988) Science 241:53-57. Briefly, in such procedures a
plurality of double stranded oligonucleotides bearing one or more mutations to be introduced
into the cloned DNA are synthesized and inserted into the cloned DNA to be mutagenized.
Clones containing the mutagenized DNA are recovered and the activities of the polypeptides
they encode are assessed.

Another method for generating variants is assembly PCR. Assembly PCR
involves the assembly of a PCR product from a mixture of small DNA fragments. A large
number of different PCR reactions occur in parallel in the same vial, with the products of one
reaction priming the products of another reaction. Assembly PCR is described in, e.g., U.S.
Patent No. 5,965,408.

Still another method of generating variants is sexual PCR mutagenesis. In
sexual PCR mutagenesis, forced homologous recombination occurs between DNA molecules
of different but highly related DNA sequence in vitro, as a result of random fragmentation of
the DNA molecule based on sequence homology, followed by fixation of the crossover by
primer extension in a PCR reaction. Sexual PCR mutagenesis is described, e.g., in Stemmer
(1994) Proc. Natl. Acad. Sci. USA 91:10747-10751. Briefly, in such procedures a plurality
of nucleic acids to be recombined are digested with DNase to generate fragments having an
average size of 50-200 nucleotides. Fragments of the desired average size are purified and
resuspended in a PCR mixture. PCR is conducted under conditions which facilitate
recombination between the nucleic acid fragments. For example, PCR may be performed by
resuspending the purified fragments at a concentration of 10-30 ng/µl in a solution of 0.2 mM
of each dNTP, 2.2 mM MgCl2, 50 mM KCl, 10 mM Tris HCl, pH 9.0, and 0.1% Triton X-100.
2.5 units of Taq polymerase per 100 µl of reaction mixture is added and PCR is performed
using the following regime: 94°C for 60 seconds, 94°C for 30 seconds, 50-55°C for 30
seconds, 72°C for 30 seconds (30-45 times) and 72°C for 5 minutes. However, it will be
appreciated that these parameters may be varied as appropriate. In some aspects,
oligonucleotides may be included in the PCR reactions. In other aspects, the Klenow
fragment of DNA polymerase I may be used in a first set of PCR reactions and Taq
polymerase may be used in a subsequent set of PCR reactions. Recombinant sequences are
isolated and the activities of the polypeptides they encode are assessed.

Variants may also be created by in vivo mutagenesis. In some aspects, random
mutations in a sequence of interest are generated by propagating the sequence of interest in a
bacterial strain, such as an E. coli strain, which carries mutations in one or more of the DNA
repair pathways. Such "mutator" strains have a higher random mutation rate than that of a
wild-type parent. Propagating the DNA in one of these strains will eventually generate
random mutations within the DNA. Mutator strains suitable for use for in vivo mutagenesis
are described in PCT Publication No. WO 91/16427, published October 31, 1991, entitled
"Methods for Phenotype Creation from Multiple Gene Populations".

Variants may also be generated using cassette mutagenesis. In cassette
mutagenesis a small region of a double stranded DNA molecule is replaced with a synthetic
oligonucleotide "cassette" that differs from the native sequence. The oligonucleotide often
contains completely and/or partially randomized native sequence.

Recursive ensemble mutagenesis may also be used to generate variants.
Recursive ensemble mutagenesis is an algorithm for protein engineering (protein
mutagenesis) developed to produce diverse populations of phenotypically related mutants
whose members differ in amino acid sequence. This method uses a feedback mechanism to
control successive rounds of combinatorial cassette mutagenesis. Recursive ensemble
mutagenesis is described in Arkin, A.P. and Youvan, D.C., PNAS, USA, 89:7811-7815,
1992.

In some aspects, variants are created using exponential ensemble mutagenesis.
Exponential ensemble mutagenesis is a process for generating combinatorial libraries with a
high percentage of unique and functional mutants, wherein small groups of residues are
randomized in parallel to identify, at each altered position, amino acids which lead to
functional proteins. Exponential ensemble mutagenesis is described in Delagrave, S. and
Youvan, D.C., Biotechnology Research, 11:1548-1552, 1993. Random and site-directed
mutagenesis are described in Arnold, F.H., Current Opinion in Biotechnology, 4:450-455,
1993.

In some aspects, the variants are created using shuffling procedures wherein 
portions of a plurality of nucleic acids which encode distinct polypeptides are fused together 
to create chimeric nucleic acid sequences which encode chimeric polypeptides as described in
U.S. Patent No. 5,965,408, filed July 9, 1996, entitled, "Method of DNA Reassembly by
Interrupting Synthesis" and U.S. Patent No. 5,939,250, filed May 22, 1996, entitled,
"Production of Enzymes Having Desired Activities by Mutagenesis".

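By way of illustration only, the fusing of portions of several parental nucleic acids into a chimeric sequence can be sketched in silico as follows. This is a simplified model and not the procedure of the patents cited above; the function name `chimera`, the pre-aligned equal-length parents and the fixed breakpoints are assumptions made for the example.

```python
def chimera(parents, breakpoints):
    """Fuse consecutive segments taken in turn from each aligned parent."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned to the same length")
    edges = [0] + sorted(breakpoints) + [length]
    pieces = []
    for i, (start, end) in enumerate(zip(edges, edges[1:])):
        # Segment i is copied from parent i, cycling through the parents.
        pieces.append(parents[i % len(parents)][start:end])
    return "".join(pieces)

# Hypothetical parents chosen so the origin of each segment is visible.
print(chimera(["AAAAAAAAA", "CCCCCCCCC"], [3, 6]))  # AAACCCAAA
```
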
The variants of the polypeptides of Group B amino acid sequences may be 
variants in which one or more of the amino acid residues of the polypeptides of the Group B
amino acid sequences are substituted with a conserved or non-conserved amino acid residue 
(preferably a conserved amino acid residue) and such substituted amino acid residue may or 
may not be one encoded by the genetic code. 

Conservative substitutions are those that substitute a given amino acid in a 
polypeptide by another amino acid of like characteristics. Typically seen as conservative
substitutions are the following replacements: replacement of an aliphatic amino acid such as
Alanine, Valine, Leucine and Isoleucine with another aliphatic amino acid; replacement of a
Serine with a Threonine or vice versa; replacement of an acidic residue such as Aspartic acid
and Glutamic acid with another acidic residue; replacement of a residue bearing an amide
group, such as Asparagine and Glutamine, with another residue bearing an amide group;
exchange of a basic residue such as Lysine and Arginine with another basic residue; and
replacement of an aromatic residue such as Phenylalanine and Tyrosine with another aromatic
residue. 
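
By way of illustration only, the replacement classes listed above can be encoded as residue groups and used to label the differences between a reference polypeptide and a variant. This is a minimal sketch; the group labels, the function names and the five-residue example sequences are assumptions made for the example, and only the residues named in the preceding paragraph are grouped.

```python
# Residue groups taken from the replacements listed above (one-letter codes).
# The group labels are illustrative only.
CONSERVATIVE_GROUPS = {
    "aliphatic": set("AVLI"),
    "hydroxyl": set("ST"),
    "acidic": set("DE"),
    "amide": set("NQ"),
    "basic": set("KR"),
    "aromatic": set("FY"),
}

def is_conservative(original, replacement):
    """True if two different residues fall in the same group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS.values())

def classify_substitutions(reference, variant):
    """List (position, reference residue, variant residue, kind) per difference."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    result = []
    for position, (ref, var) in enumerate(zip(reference, variant), start=1):
        if ref != var:
            kind = "conservative" if is_conservative(ref, var) else "non-conservative"
            result.append((position, ref, var, kind))
    return result

# Hypothetical reference and variant peptides.
print(classify_substitutions("MAVKD", "MLVRG"))
```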

Other variants are those in which one or more of the amino acid residues of the 
polypeptides of the Group B amino acid sequences include a substituent group.

Still other variants are those in which the polypeptide is associated with 
another compound, such as a compound to increase the half-life of the polypeptide (for 
example, polyethylene glycol). 

Additional variants are those in which additional amino acids are fused to the 
polypeptide, such as a leader sequence, a secretory sequence, a proprotein sequence or a
sequence which facilitates purification, enrichment, or stabilization of the polypeptide. 

In some aspects, the fragments, derivatives and analogs retain the same 
biological function or activity as the polypeptides of Group B amino acid sequences and 
sequences substantially identical thereto. In other aspects, the fragment, derivative, or analog 
includes a proprotein, such that the fragment, derivative, or analog can be activated by
cleavage of the proprotein portion to produce an active polypeptide. 

Optimizing codons to achieve high levels of protein expression in host cells 

The invention provides methods for modifying xylanase-encoding nucleic 
acids to modify codon usage. In one aspect, the invention provides methods for modifying
codons in a nucleic acid encoding a xylanase to increase or decrease its expression in a host
cell. The invention also provides nucleic acids encoding a xylanase modified to increase its
expression in a host cell, xylanases so modified, and methods of making the modified
xylanases. The method comprises identifying a "non-preferred" or a "less preferred" codon
in a xylanase-encoding nucleic acid and replacing one or more of these non-preferred or less
preferred codons with a "preferred codon" encoding the same amino acid as the replaced
codon, such that at least one non-preferred or less preferred codon in the nucleic acid has been
replaced by a preferred codon encoding the same amino acid. A preferred codon is a codon
over-represented in coding sequences in genes in the host cell and a non-preferred or less
preferred codon is a codon under-represented in coding sequences in genes in the host cell.
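
By way of illustration only, the identification of preferred codons and the replacement of non-preferred codons can be sketched as follows. This is a minimal sketch under stated assumptions: the "host" is represented by a single made-up coding sequence, the preferred codon for each amino acid is simply the synonymous codon counted most often in that sample, and the function names are hypothetical. A practical method would use a codon usage table derived from many genes of the host cell. The standard genetic code is assumed.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def codons(seq):
    """Split an in-frame coding sequence into codons."""
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

def preferred_codons(host_coding_sequences):
    """For each amino acid, pick the synonymous codon seen most often in the host."""
    counts = Counter(c for seq in host_coding_sequences for c in codons(seq))
    preferred = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in preferred or counts[codon] > counts[preferred[aa]]:
            preferred[aa] = codon
    return preferred

def optimize(seq, preferred):
    """Replace every codon with the preferred codon for the same amino acid."""
    return "".join(preferred[CODON_TABLE[c]] for c in codons(seq))

def translate(seq):
    return "".join(CODON_TABLE[c] for c in codons(seq))

host_genes = ["ATGCTGCTGAAAGGCCTGAAAGGC"]  # hypothetical host sample
original = "ATGTTAAAGGGTTAA"               # hypothetical xylanase fragment
optimized = optimize(original, preferred_codons(host_genes))
print(optimized)                                    # ATGCTGAAAGGCTAA
print(translate(original) == translate(optimized))  # True
```

Because each codon is exchanged only for a synonymous codon, the encoded amino acid sequence is unchanged.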

Host cells for expressing the nucleic acids, expression cassettes and vectors of 
the invention include bacteria, yeast, fungi, plant cells, insect cells and mammalian cells. 
Thus, the invention provides methods for optimizing codon usage in all of these cells, codon-
altered nucleic acids and polypeptides made by the codon-altered nucleic acids. Exemplary
host cells include gram negative bacteria, such as Escherichia coli and Pseudomonas
fluorescens; gram positive bacteria, such as Streptomyces diversa, Lactobacillus gasseri,
Lactococcus lactis, Lactococcus cremoris, and Bacillus subtilis. Exemplary host cells also
include eukaryotic organisms, e.g., various yeast, such as Saccharomyces sp., including
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pichia pastoris, and Kluyveromyces
lactis, Hansenula polymorpha, Aspergillus niger, and mammalian cells and cell lines and
insect cells and cell lines. Thus, the invention also includes nucleic acids and polypeptides
optimized for expression in these organisms and species. 

For example, the codons of a nucleic acid encoding a xylanase isolated from a 
bacterial cell are modified such that the nucleic acid is optimally expressed in a bacterial cell
different from the bacteria from which the xylanase was derived, a yeast, a fungus, a plant cell,
an insect cell or a mammalian cell. Methods for optimizing codons are well known in the art, 
see, e.g., U.S. Patent No. 5,795,737; Baca (2000) Int. J. Parasitol. 30:113-118; Hale (1998) 
Protein Expr. Purif. 12:185-188; Narum (2001) Infect. Immun. 69:7250-7253. See also 
Narum (2001) Infect. Immun. 69:7250-7253, describing optimizing codons in mouse
systems; Outchkourov (2002) Protein Expr. Purif. 24:18-24, describing optimizing codons in
yeast; Feng (2000) Biochemistry 39:15399-15409, describing optimizing codons in E. coli;
Humphreys (2000) Protein Expr. Purif. 20:252-264, describing optimizing codon usage that
affects secretion in E. coli.

Transgenic non-human animals 

The invention provides transgenic non-human animals comprising a nucleic
acid, a polypeptide (e.g., a xylanase), an expression cassette or vector or a transfected or
transformed cell of the invention. The invention also provides methods of making and using
these transgenic non-human animals.

The transgenic non-human animals can be, e.g., goats, rabbits, sheep, pigs, 
cows, rats and mice, comprising the nucleic acids of the invention. These animals can be 
used, e.g., as in vivo models to study xylanase activity, or, as models to screen for agents that 
change the xylanase activity in vivo. The coding sequences for the polypeptides to be
expressed in the transgenic non-human animals can be designed to be constitutive, or, under
the control of tissue-specific, developmental-specific or inducible transcriptional regulatory
factors. Transgenic non-human animals can be designed and generated using any method
known in the art; see, e.g., U.S. Patent Nos. 6,211,428; 6,187,992; 6,156,952; 6,118,044;
6,111,166; 6,107,541; 5,959,171; 5,922,854; 5,892,070; 5,880,327; 5,891,698; 5,639,940;
5,573,933; 5,387,742; 5,087,571, describing making and using transformed cells and eggs
and transgenic mice, rats, rabbits, sheep, pigs and cows. See also, e.g., Pollock (1999) J.
Immunol. Methods 231:147-157, describing the production of recombinant proteins in the
milk of transgenic dairy animals; Baguisi (1999) Nat. Biotechnol. 17:456-461, demonstrating
the production of transgenic goats. U.S. Patent No. 6,211,428, describes making and using
transgenic non-human mammals which express in their brains a nucleic acid construct
comprising a DNA sequence. U.S. Patent No. 5,387,742, describes injecting cloned
recombinant or synthetic DNA sequences into fertilized mouse eggs, implanting the injected
eggs in pseudo-pregnant females, and growing to term transgenic mice whose cells express
proteins related to the pathology of Alzheimer's disease. U.S. Patent No. 6,187,992,
describes making and using a transgenic mouse whose genome comprises a disruption of the
gene encoding amyloid precursor protein (APP). 

"Knockout animals" can also be used to practice the methods of the invention. 
For example, in one aspect, the transgenic or modified animals of the invention comprise a 
"knockout animal," e.g., a "knockout mouse," engineered not to express an endogenous gene, 
which is replaced with a gene expressing a xylanase of the invention, or, a fusion protein
comprising a xylanase of the invention. 

Transgenic Plants and Seeds 

The invention provides transgenic plants and seeds comprising a nucleic acid,
a polypeptide (e.g., a xylanase), an expression cassette or vector or a transfected or
transformed cell of the invention. The invention also provides plant products, e.g., oils,
seeds, leaves, extracts and the like, comprising a nucleic acid and/or a polypeptide (e.g., a
xylanase) of the invention. The transgenic plant can be dicotyledonous (a dicot) or
monocotyledonous (a monocot). The invention also provides methods of making and using
these transgenic plants and seeds. The transgenic plant or plant cell expressing a polypeptide
of the present invention may be constructed in accordance with any method known in the art. 
See, for example, U.S. Patent No. 6,309,872. 

Nucleic acids and expression constructs of the invention can be introduced 
into a plant cell by any means. For example, nucleic acids or expression constructs can be
introduced into the genome of a desired plant host, or, the nucleic acids or expression
constructs can be episomes. Introduction into the genome of a desired plant can be such that
the host's xylanase production is regulated by endogenous transcriptional or translational
control elements. The invention also provides "knockout plants" where insertion of a gene
sequence by, e.g., homologous recombination, has disrupted the expression of the
endogenous gene. Means to generate "knockout" plants are well-known in the art, see, e.g.,
Strepp (1998) Proc. Natl. Acad. Sci. USA 95:4368-4373; Miao (1995) Plant J. 7:359-365. See
discussion on transgenic plants, below. 

The nucleic acids of the invention can be used to confer desired traits on 
essentially any plant, e.g., on starch-producing plants, such as potato, wheat, rice, barley, and
the like. Nucleic acids of the invention can be used to manipulate metabolic pathways of a
plant in order to optimize or alter the host's expression of xylanase. This can change xylanase
activity in a plant. Alternatively, a xylanase of the invention can be used in production of a 
transgenic plant to produce a compound not naturally produced by that plant. This can lower 
production costs or create a novel product. 

In one aspect, the first step in production of a transgenic plant involves making
an expression construct for expression in a plant cell. These techniques are well known in the
art. They can include selecting and cloning a promoter, a coding sequence for facilitating 
efficient binding of ribosomes to mRNA and selecting the appropriate gene terminator 
sequences. One exemplary constitutive promoter is CaMV35S, from the cauliflower mosaic 
virus, which generally results in a high degree of expression in plants. Other promoters are
more specific and respond to cues in the plant's internal or external environment. An 
exemplary light-inducible promoter is the promoter from the cab gene, encoding the major 
chlorophyll a/b binding protein. 

In one aspect, the nucleic acid is modified to achieve greater expression in a
plant cell. For example, a sequence of the invention is likely to have a higher percentage of
A-T nucleotide pairs compared to that seen in a plant, some of which prefer G-C nucleotide
pairs. Therefore, A-T nucleotides in the coding sequence can be substituted with G-C
nucleotides without significantly changing the amino acid sequence to enhance production of
the gene product in plant cells.
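
By way of illustration only, the substitution of A-T nucleotides with G-C nucleotides without changing the amino acid sequence can be sketched as follows. This is a minimal sketch under stated assumptions: every codon is exchanged for the synonymous codon with the most G and C bases (the original codon is kept when it is already maximal), the standard genetic code is assumed, and the function names and the 12-nucleotide example are hypothetical. It ignores other factors, such as the codon preferences of a particular plant.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Group synonymous codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)

def gc_count(seq):
    return sum(base in "GC" for base in seq)

def gc_fraction(seq):
    return gc_count(seq) / len(seq)

def gc_enrich(seq):
    """Swap each codon for a synonymous codon with the highest G+C count."""
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        best = max(SYNONYMS[CODON_TABLE[codon]], key=gc_count)
        # Keep the original codon when it is already as G-C rich as possible.
        out.append(codon if gc_count(codon) == gc_count(best) else best)
    return "".join(out)

original = "ATGAAATTTGCA"  # hypothetical fragment encoding M-K-F-A
enriched = gc_enrich(original)
print(enriched)                                      # ATGAAGTTCGCC
print(gc_fraction(original), gc_fraction(enriched))  # 0.25 0.5
```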

A selectable marker gene can be added to the gene construct in order to identify
plant cells or tissues that have successfully integrated the transgene. This may be necessary
because achieving incorporation and expression of genes in plant cells is a rare event,
occurring in just a few percent of the targeted tissues or cells. Selectable marker genes
encode proteins that provide resistance to agents that are normally toxic to plants, such as
antibiotics or herbicides. Only plant cells that have integrated the selectable marker gene will
survive when grown on a medium containing the appropriate antibiotic or herbicide. As for 
other inserted genes, marker genes also require promoter and termination sequences for 
proper function. 

In one aspect, making transgenic plants or seeds comprises incorporating
sequences of the invention and, optionally, marker genes into a target expression construct
(e.g., a plasmid), along with positioning of the promoter and the terminator sequences. This
can involve transferring the modified gene into the plant through a suitable method. For
example, a construct may be introduced directly into the genomic DNA of the plant cell using
techniques such as electroporation and microinjection of plant cell protoplasts, or the
constructs can be introduced directly to plant tissue using ballistic methods, such as DNA
particle bombardment. See, e.g., Christou (1997) Plant Mol. Biol. 35:197-203;
Pawlowski (1996) Mol. Biotechnol. 6:17-30; Klein (1987) Nature 327:70-73; Takumi (1997)
Genes Genet. Syst. 72:63-69, discussing use of particle bombardment to introduce transgenes
into wheat; and Adam (1997) supra, for use of particle bombardment to introduce YACs into
plant cells. For example, Rinehart (1997) supra, used particle bombardment to generate
transgenic cotton plants. Apparatus for accelerating particles is described in U.S. Pat. No.
5,015,580; and, the commercially available BioRad (Biolistics) PDS-2000 particle
WO 03/106654 PCT/US03/19153 
acceleration instrument; see also, John, U.S. Patent No. 5,608,148; and Ellis, U.S. Patent No. 
5,681,730, describing particle-mediated transformation of gymnosperms.

In one aspect, protoplasts can be immobilized and injected with nucleic
acids, e.g., an expression construct. Although plant regeneration from protoplasts is not easy
with cereals, plant regeneration is possible in legumes using somatic embryogenesis from
protoplast derived callus. Organized tissues can be transformed with naked DNA using a gene
gun technique, where DNA is coated on tungsten microprojectiles, shot 1/100th the size of
cells, which carry the DNA deep into cells and organelles. Transformed tissue is then induced 
to regenerate, usually by somatic embryogenesis. This technique has been successful in 
several cereal species including maize and rice. 

Nucleic acids, e.g., expression constructs, can also be introduced into plant
cells using recombinant viruses. Plant cells can be transformed using viral vectors, such as,
e.g., tobacco mosaic virus derived vectors (Rouwendal (1997) Plant Mol. Biol. 33:989-999),
see Porta (1996) "Use of viral replicons for the expression of genes in plants" Mol.
Biotechnol. 5:209-221.

Alternatively, nucleic acids, e.g., an expression construct, can be combined 
with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium 
tumefaciens host vector. The virulence functions of the Agrobacterium tumefaciens host will 
direct the insertion of the construct and adjacent marker into the plant cell DNA when the cell 
is infected by the bacteria. Agrobacterium tumefaciens-mediated transformation techniques, 
including disarming and use of binary vectors, are well described in the scientific literature. 
See, e.g., Horsch (1984) Science 233:496-498; Fraley (1983) Proc. Natl. Acad. Sci. USA
80:4803; Gene Transfer to Plants, Potrykus, ed. (Springer-Verlag, Berlin 1995). The
DNA in an A. tumefaciens cell is contained in the bacterial chromosome as well as in another
structure known as a Ti (tumor-inducing) plasmid. The Ti plasmid contains a stretch of DNA
termed T-DNA (~20 kb long) that is transferred to the plant cell in the infection process and a
series of vir (virulence) genes that direct the infection process. A. tumefaciens can only infect
a plant through wounds: when a plant root or stem is wounded it gives off certain chemical
signals, in response to which, the vir genes of A. tumefaciens become activated and direct a
series of events necessary for the transfer of the T-DNA from the Ti plasmid to the plant's
chromosome. The T-DNA then enters the plant cell through the wound. One speculation is
that the T-DNA waits until the plant DNA is being replicated or transcribed, then inserts itself
into the exposed plant DNA. In order to use A. tumefaciens as a transgene vector, the tumor-
inducing section of T-DNA has to be removed, while retaining the T-DNA border regions
and the vir genes. The transgene is then inserted between the T-DNA border regions, where
it is transferred to the plant cell and becomes integrated into the plant's chromosomes. 

The invention provides for the transformation of monocotyledonous plants 
using the nucleic acids of the invention, including important cereals, see Hiei (1997) Plant 
Mol. Biol. 35:205-218. See also, e.g., Horsch, Science (1984) 233:496; Fraley (1983) Proc.
Natl. Acad. Sci. USA 80:4803; Thykjaer (1997) supra; Park (1996) Plant Mol. Biol.
32:1135-1148, discussing T-DNA integration into genomic DNA. See also D'Halluin, U.S.
Patent No. 5,712,135, describing a process for the stable integration of a DNA comprising a 
gene that is functional in a cell of a cereal, or other monocotyledonous plant. 

In one aspect, the third step can involve selection and regeneration of whole 
plants capable of transmitting the incorporated target gene to the next generation. Such 
regeneration techniques rely on manipulation of certain phytohormones in a tissue culture 
growth medium, typically relying on a biocide and/or herbicide marker that has been 
introduced together with the desired nucleotide sequences. Plant regeneration from cultured 
protoplasts is described in Evans et al., Protoplasts Isolation and Culture, Handbook of Plant 
Cell Culture, pp. 124-176, Macmillan Publishing Company, New York, 1983; and Binding,
Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC Press, Boca Raton, 1985. 
Regeneration can also be obtained from plant callus, explants, organs, or parts thereof. Such 
regeneration techniques are described generally in Klee (1987) Ann. Rev. of Plant Phys. 
38:467-486. To obtain whole plants from transgenic tissues such as immature embryos, they 
can be grown under controlled environmental conditions in a series of media containing 
nutrients and hormones, a process known as tissue culture. Once whole plants are generated 
and produce seed, evaluation of the progeny begins. 

After the expression cassette is stably incorporated in transgenic plants, it can 
be introduced into other plants by sexual crossing. Any of a number of standard breeding 
techniques can be used, depending upon the species to be crossed. Since transgenic 
expression of the nucleic acids of the invention leads to phenotypic changes, plants 
comprising the recombinant nucleic acids of the invention can be sexually crossed with a 
second plant to obtain a final product. Thus, the seed of the invention can be derived from a 
cross between two transgenic plants of the invention, or a cross between a plant of the 
invention and another plant. The desired effects (e.g., expression of the polypeptides of the 
invention to produce a plant in which flowering behavior is altered) can be enhanced when 
both parental plants express the polypeptides (e.g., a xylanase) of the invention. The desired 
effects can be passed to future plant generations by standard propagation means. 

The nucleic acids and polypeptides of the invention are expressed in or
inserted in any plant or seed. Transgenic plants of the invention can be dicotyledonous or
monocotyledonous. Examples of monocot transgenic plants of the invention are grasses,
such as meadow grass (blue grass, Poa), forage grass such as festuca, lolium, temperate
grass, such as Agrostis, and cereals, e.g., wheat, oats, rye, barley, rice, sorghum, and maize
(corn). Examples of dicot transgenic plants of the invention are tobacco, legumes, such as
lupins, potato, sugar beet, pea, bean and soybean, and cruciferous plants (family
Brassicaceae), such as cauliflower, rape seed, and the closely related model organism
Arabidopsis thaliana. Thus, the transgenic plants and seeds of the invention include a broad
range of plants, including, but not limited to, species from the genera Anacardium, Arachis,
Asparagus, Atropa, Avena, Brassica, Citrus, Citrullus, Capsicum, Carthamus, Cocos, Coffea,
Cucumis, Cucurbita, Daucus, Elaeis, Fragaria, Glycine, Gossypium, Helianthus,
Heterocallis, Hordeum, Hyoscyamus, Lactuca, Linum, Lolium, Lupinus, Lycopersicon,
Malus, Manihot, Majorana, Medicago, Nicotiana, Olea, Oryza, Panicum, Pannisetum,
Persea, Phaseolus, Pistachio, Pisum, Pyrus, Prunus, Raphanus, Ricinus, Secale, Senecio,
Sinapis, Solanum, Sorghum, Theobromus, Trigonella, Triticum, Vicia, Vitis, Vigna, and Zea.

In alternative embodiments, the nucleic acids of the invention are expressed in 
plants which contain fiber cells, including, e.g., cotton, silk cotton tree (Kapok, Ceiba 
pentandra), desert willow, creosote bush, winterfat, balsa, ramie, kenaf, hemp, roselle, jute, 
sisal, abaca and flax. In alternative embodiments, the transgenic plants of the invention can
be members of the genus Gossypium, including members of any Gossypium species, such as
G. arboreum, G. herbaceum, G. barbadense, and G. hirsutum.

The invention also provides for transgenic plants to be used for producing 
large amounts of the polypeptides (e.g., a xylanase or antibody) of the invention. For 
example, see Palmgren (1997) Trends Genet. 13:348; Chong (1997) Transgenic Res.
6:289-296 (producing human milk protein beta-casein in transgenic potato plants using an
auxin-inducible, bidirectional mannopine synthase (mas1',2') promoter with Agrobacterium
tumefaciens-mediated leaf disc transformation methods).

Using known procedures, one of skill can screen for plants of the invention by 
detecting the increase or decrease of transgene mRNA or protein in transgenic plants. Means
for detecting and quantitating mRNAs or proteins are well known in the art.

Polypeptides and peptides

In one aspect, the invention provides isolated or recombinant polypeptides 
having a sequence identity (e.g., at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 
58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 
74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%,
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or complete (100%)
sequence identity) to an exemplary sequence of the invention, e.g., proteins having a
sequence as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID
NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID
NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42,
SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID
NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64,
SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72, SEQ ID NO:74, SEQ ID
NO:76, SEQ ID NO:78, SEQ ID NO:80, SEQ ID NO:82, SEQ ID NO:84, SEQ ID NO:86,
SEQ ID NO:88, SEQ ID NO:90, SEQ ID NO:92, SEQ ID NO:94, SEQ ID NO:96, SEQ ID
NO:98, SEQ ID NO:100, SEQ ID NO:102, SEQ ID NO:104, SEQ ID NO:106, SEQ ID
NO:108, SEQ ID NO:110, SEQ ID NO:112, SEQ ID NO:114, SEQ ID NO:116, SEQ ID
NO:118, SEQ ID NO:120, SEQ ID NO:122, SEQ ID NO:124, SEQ ID NO:126, SEQ ID
NO:128, SEQ ID NO:130, SEQ ID NO:132, SEQ ID NO:134, SEQ ID NO:136, SEQ ID
NO:138, SEQ ID NO:140, SEQ ID NO:142, SEQ ID NO:144, SEQ ID NO:146, SEQ ID NO:148,
SEQ ID NO:150, SEQ ID NO:152, SEQ ID NO:154, SEQ ID NO:156, SEQ ID NO:158,
SEQ ID NO:160, SEQ ID NO:162, SEQ ID NO:164, SEQ ID NO:166, SEQ ID NO:168,
SEQ ID NO:170, SEQ ID NO:172, SEQ ID NO:174, SEQ ID NO:176, SEQ ID NO:178,
SEQ ID NO:180, SEQ ID NO:182, SEQ ID NO:184, SEQ ID NO:186, SEQ ID NO:188,
SEQ ID NO:190, SEQ ID NO:192, SEQ ID NO:194, SEQ ID NO:196, SEQ ID NO:198,
SEQ ID NO:200, SEQ ID NO:202, SEQ ID NO:204, SEQ ID NO:206, SEQ ID NO:208,
SEQ ID NO:210, SEQ ID NO:212, SEQ ID NO:214, SEQ ID NO:216, SEQ ID NO:218,
SEQ ID NO:220, SEQ ID NO:222, SEQ ID NO:224, SEQ ID NO:226, SEQ ID NO:228,
SEQ ID NO:230, SEQ ID NO:232, SEQ ID NO:234, SEQ ID NO:236, SEQ ID NO:238,
SEQ ID NO:240, SEQ ID NO:242, SEQ ID NO:244, SEQ ID NO:246, SEQ ID NO:248,
SEQ ID NO:250, SEQ ID NO:252, SEQ ID NO:254, SEQ ID NO:256, SEQ ID NO:258,
SEQ ID NO:260, SEQ ID NO:262, SEQ ID NO:264, SEQ ID NO:266, SEQ ID NO:268,
SEQ ID NO:270, SEQ ID NO:272, SEQ ID NO:274, SEQ ID NO:276, SEQ ID NO:278,
SEQ ID NO:280, SEQ ID NO:282, SEQ ID NO:284, SEQ ID NO:286, SEQ ID NO:288,
SEQ ID NO:290, SEQ ID NO:292, SEQ ID NO:294, SEQ ID NO:296, SEQ ID NO:298,
SEQ ID NO:300, SEQ ID NO:302, SEQ ID NO:304, SEQ ID NO:306, SEQ ID NO:308,
SEQ ID NO:310, SEQ ID NO:312, SEQ ID NO:314, SEQ ID NO:316, SEQ ID NO:318,
SEQ ID NO:320, SEQ ID NO:322, SEQ ID NO:324, SEQ ID NO:326, SEQ ID NO:328,
SEQ ID NO:330, SEQ ID NO:332, SEQ ID NO:334, SEQ ID NO:336, SEQ ID NO:338,
SEQ ID NO:340, SEQ ID NO:342, SEQ ID NO:344, SEQ ID NO:346, SEQ ID NO:348,
SEQ ID NO:350, SEQ ID NO:352, SEQ ID NO:354, SEQ ID NO:356, SEQ ID NO:358,
SEQ ID NO:360, SEQ ID NO:362, SEQ ID NO:364, SEQ ID NO:366, SEQ ID NO:368,
SEQ ID NO:370, SEQ ID NO:372, SEQ ID NO:374, SEQ ID NO:376, SEQ ID NO:378 or
SEQ ID NO:380. In one aspect, the polypeptide has a xylanase activity, e.g., can hydrolyze a 
glycosidic bond in a polysaccharide, e.g., a xylan. In one aspect, the polypeptide has a 
xylanase activity comprising catalyzing hydrolysis of internal β-1,4-xylosidic linkages. In
one aspect, the xylanase activity comprises an endo-1,4-beta-xylanase activity. In one aspect,
the xylanase activity comprises hydrolyzing a xylan to produce a smaller molecular weight
xylose and xylo-oligomer. In one aspect, the xylan comprises an arabinoxylan, such as a 
water soluble arabinoxylan. 

The polypeptides of the invention include xylanases in an active or inactive 
form. For example, the polypeptides of the invention include proproteins before
"maturation" or processing of prepro sequences, e.g., by a proprotein-processing enzyme,
such as a proprotein convertase to generate an "active" mature protein. The polypeptides of
the invention include xylanases inactive for other reasons, e.g., before "activation" by a post-
translational processing event, e.g., an endo- or exo-peptidase or proteinase action, a
phosphorylation event, an amidation, a glycosylation or a sulfation, a dimerization event, and
the like. The polypeptides of the invention include all active forms, including active
subsequences, e.g., catalytic domains or active sites, of the xylanase. 

Methods for identifying "prepro" domain sequences and signal sequences are
well known in the art, see, e.g., Van de Ven (1993) Crit. Rev. Oncog. 4(2):115-136. For
example, to identify a prepro sequence, the protein is purified from the extracellular space
and the N-terminal protein sequence is determined and compared to the unprocessed form.

The invention includes polypeptides with or without a signal sequence and/or 
a prepro sequence. The invention includes polypeptides with heterologous signal sequences 
and/or prepro sequences. The prepro sequence (including a sequence of the invention used as 
a heterologous prepro domain) can be located on the amino terminal or the carboxy terminal 
end of the protein. The invention also includes isolated or recombinant signal sequences,
prepro sequences and catalytic domains (e.g., "active sites") comprising sequences of the
invention. 

The percent sequence identity can be over the full length of the polypeptide, 
or, the identity can be over a region of at least about 50, 60, 70, 80, 90, 100, 150, 200, 250,
300, 350, 400, 450, 500, 550, 600, 650, 700 or more residues. Polypeptides of the invention
can also be shorter than the full length of exemplary polypeptides. In alternative aspects, the
invention provides polypeptides (peptides, fragments) ranging in size between about 5 and
the full length of a polypeptide, e.g., an enzyme, such as a xylanase; exemplary sizes being of
about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 100, 125, 150, 175,
200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, or more residues, e.g., contiguous 
residues of an exemplary xylanase of the invention. 
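
By way of illustration only, percent sequence identity over the full length or over a region, and the enumeration of contiguous fragments of a given size, can be sketched as follows. This is a minimal sketch under stated assumptions: the two sequences are taken to be already aligned and of equal length, identity is counted column by column, and the function names and ten-residue example peptides are hypothetical. Identity values obtained in practice depend on the alignment algorithm and parameters used.

```python
def percent_identity(a, b, start=0, end=None):
    """Percent identical columns between two pre-aligned sequences in [start, end)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    end = len(a) if end is None else end
    columns = list(zip(a[start:end], b[start:end]))
    if not columns:
        raise ValueError("empty comparison region")
    # A gap character never counts as a match.
    matches = sum(x == y and x != "-" for x, y in columns)
    return 100.0 * matches / len(columns)

def fragments(seq, size):
    """All contiguous subsequences of the given size."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]

# Hypothetical aligned peptides differing at positions 6 and 10.
exemplary = "MKTAYIAKQR"
candidate = "MKTAYLAKQK"
print(percent_identity(exemplary, candidate))        # 80.0
print(percent_identity(exemplary, candidate, 0, 5))  # 100.0
print(fragments(exemplary, 5)[0])                    # MKTAY
```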

Peptides of the invention (e.g., a subsequence of an exemplary polypeptide of 
the invention) can be useful as, e.g., labeling probes, antigens, toleragens, motifs, xylanase
active sites (e.g., "catalytic domains"), signal sequences and/or prepro domains.

Polypeptides and peptides of the invention can be isolated from natural 
sources, be synthetic, or be recombinantly generated polypeptides. Peptides and proteins can 
be recombinantly expressed in vitro or in vivo. The peptides and polypeptides of the 
invention can be made and isolated using any method known in the art. Polypeptides and
peptides of the invention can also be synthesized, whole or in part, using chemical methods
well known in the art. See e.g., Caruthers (1980) Nucleic Acids Res. Symp. Ser. 215-223;
Horn (1980) Nucleic Acids Res. Symp. Ser. 225-232; Banga, A.K., Therapeutic Peptides and
Proteins, Formulation, Processing and Delivery Systems (1995) Technomic Publishing Co.,
Lancaster, PA. For example, peptide synthesis can be performed using various solid-phase
techniques (see e.g., Roberge (1995) Science 269:202; Merrifield (1997) Methods Enzymol.
289:3-13) and automated synthesis may be achieved, e.g., using the ABI 431A Peptide
Synthesizer (Perkin Elmer) in accordance with the instructions provided by the manufacturer. 

The peptides and polypeptides of the invention can also be glycosylated. The 
glycosylation can be added post-translationally either chemically or by cellular biosynthetic
mechanisms, wherein the latter incorporates the use of known glycosylation motifs, which can
be native to the sequence or can be added as a peptide or added in the nucleic acid coding 
sequence. The glycosylation can be O-linked or N-linked. 

The peptides and polypeptides of the invention, as defined above, include all
"mimetic" and "peptidomimetic" forms. The terms "mimetic" and "peptidomimetic" refer to
a synthetic chemical compound which has substantially the same structural and/or functional
characteristics as the polypeptides of the invention. The mimetic can be either entirely
composed of synthetic, non-natural analogues of amino acids, or a chimeric molecule of
partly natural peptide amino acids and partly non-natural analogs of amino acids. The
mimetic can also incorporate any amount of natural amino acid conservative substitutions as
long as such substitutions also do not substantially alter the mimetic's structure and/or
activity. As with polypeptides of the invention which are conservative variants, routine
experimentation will determine whether a mimetic is within the scope of the invention, i.e.,
that its structure and/or function is not substantially altered. Thus, in one aspect, a mimetic
composition is within the scope of the invention if it has a xylanase activity.

Polypeptide mimetic compositions of the invention can contain any 
combination of non-natural structural components. In an alternative aspect, mimetic
compositions of the invention include one or all of the following three structural groups: a)
residue linkage groups other than the natural amide bond ("peptide bond") linkages; b)
non-natural residues in place of naturally occurring amino acid residues; or c) residues which
induce secondary structural mimicry, i.e., to induce or stabilize a secondary structure, e.g., a
beta turn, gamma turn, beta sheet, alpha helix conformation, and the like. For example, a
polypeptide of the invention can be characterized as a mimetic when all or some of its
residues are joined by chemical means other than natural peptide bonds. Individual
peptidomimetic residues can be joined by peptide bonds, other chemical bonds or coupling
means, such as, e.g., glutaraldehyde, N-hydroxysuccinimide esters, bifunctional maleimides,
N,N'-dicyclohexylcarbodiimide (DCC) or N,N'-diisopropylcarbodiimide (DIC). Linking
groups that can be an alternative to the traditional amide bond ("peptide bond") linkages
include, e.g., ketomethylene (e.g., -C(=O)-CH2- for -C(=O)-NH-), aminomethylene
(CH2-NH), ethylene, olefin (CH=CH), ether (CH2-O), thioether (CH2-S), tetrazole (CN4-),
thiazole, retroamide, thioamide, or ester (see, e.g., Spatola (1983) in Chemistry and
Biochemistry of Amino Acids, Peptides and Proteins, Vol. 7, pp. 267-357, "Peptide Backbone
Modifications," Marcel Dekker, NY).

A polypeptide of the invention can also be characterized as a mimetic by 
containing all or some non-natural residues in place of naturally occurring amino acid
residues. Non-natural residues are well described in the scientific and patent literature; a few
exemplary non-natural compositions useful as mimetics of natural amino acid residues and
guidelines are described below. Mimetics of aromatic amino acids can be generated by
replacing by, e.g., D- or L-naphthylalanine; D- or L-phenylglycine; D- or L-2 thienylalanine;
D- or L-1, -2, 3-, or 4-pyreneylalanine; D- or L-3 thienylalanine; D- or
L-(2-pyridinyl)-alanine; D- or L-(3-pyridinyl)-alanine; D- or L-(2-pyrazinyl)-alanine; D- or
L-(4-isopropyl)-phenylglycine; D-(trifluoromethyl)-phenylglycine;
D-(trifluoromethyl)-phenylalanine; D-p-fluoro-phenylalanine; D- or
L-p-biphenylphenylalanine; D- or L-p-methoxy-biphenylphenylalanine; D- or
L-2-indole(alkyl)alanines; and, D- or L-alkylalanines, where
alkyl can be substituted or unsubstituted methyl, ethyl, propyl, hexyl, butyl, pentyl, isopropyl,
iso-butyl, sec-isotyl, iso-pentyl, or a non-acidic amino acid. Aromatic rings of a non-natural
amino acid include, e.g., thiazolyl, thiophenyl, pyrazolyl, benzimidazolyl, naphthyl, furanyl, 
pyrrolyl, and pyridyl aromatic rings. 

Mimetics of acidic amino acids can be generated by substitution by, e.g.,
non-carboxylate amino acids while maintaining a negative charge; (phosphono)alanine; sulfated
threonine. Carboxyl side groups (e.g., aspartyl or glutamyl) can also be selectively modified
by reaction with carbodiimides (R'-N-C-N-R') such as, e.g., 1-cyclohexyl-3(2-morpholinyl-
(4-ethyl) carbodiimide or 1-ethyl-3(4-azonia-4,4-dimetholpentyl) carbodiimide. Aspartyl or
glutamyl can also be converted to asparaginyl and glutaminyl residues by reaction with
ammonium ions. Mimetics of basic amino acids can be generated by substitution with, e.g.,
(in addition to lysine and arginine) the amino acids ornithine, citrulline, or (guanidino)-acetic
acid, or (guanidino)alkyl-acetic acid, where alkyl is defined above. Nitrile derivatives (e.g.,
containing the CN-moiety in place of COOH) can be substituted for asparagine or glutamine.
Asparaginyl and glutaminyl residues can be deaminated to the corresponding aspartyl or
glutamyl residues. Arginine residue mimetics can be generated by reacting arginyl with, e.g.,
one or more conventional reagents, including, e.g., phenylglyoxal, 2,3-butanedione,
1,2-cyclo-hexanedione, or ninhydrin, preferably under alkaline conditions. Tyrosine residue
mimetics can be generated by reacting tyrosyl with, e.g., aromatic diazonium compounds or
tetranitromethane. N-acetylimidazole and tetranitromethane can be used to form O-acetyl
tyrosyl species and 3-nitro derivatives, respectively. Cysteine residue mimetics can be
generated by reacting cysteinyl residues with, e.g., alpha-haloacetates such as 2-chloroacetic
acid or chloroacetamide and corresponding amines, to give carboxymethyl or
carboxyamidomethyl derivatives. Cysteine residue mimetics can also be generated by
reacting cysteinyl residues with, e.g., bromo-trifluoroacetone,
alpha-bromo-beta-(5-imidazolyl) propionic acid; chloroacetyl phosphate, N-alkylmaleimides,
3-nitro-2-pyridyl disulfide; methyl 2-pyridyl disulfide; p-chloromercuribenzoate;
2-chloromercuri-4 nitrophenol; or, chloro-7-nitrobenzo-oxa-1,3-diazole. Lysine mimetics can
be generated (and amino terminal residues can be altered) by reacting lysinyl with, e.g.,
succinic or other carboxylic acid anhydrides. Lysine and other alpha-amino-containing
residue mimetics can also be generated by reaction with imidoesters, such as methyl
picolinimidate, pyridoxal phosphate, pyridoxal, chloroborohydride, trinitro-benzenesulfonic
acid, O-methylisourea, 2,4-pentanedione, and transamidase-catalyzed reactions with
glyoxylate. Mimetics of methionine
can be generated by reaction with, e.g., methionine sulfoxide. Mimetics of proline include,
e.g., pipecolic acid, thiazolidine carboxylic acid, 3- or 4-hydroxy proline, dehydroproline, 3-
or 4-methylproline, or 3,3-dimethylproline. Histidine residue mimetics can be generated by
reacting histidyl with, e.g., diethylpyrocarbonate or para-bromophenacyl bromide. Other
mimetics include, e.g., those generated by hydroxylation of proline and lysine;
phosphorylation of the hydroxyl groups of seryl or threonyl residues; methylation of the
alpha-amino groups of lysine, arginine and histidine; acetylation of the N-terminal amine; 
methylation of main chain amide residues or substitution with N-methyl amino acids; or 
amidation of C-terminal carboxyl groups. 

A residue, e.g., an amino acid, of a polypeptide of the invention can also be 
replaced by an amino acid (or peptidomimetic residue) of the opposite chirality. Thus, any
amino acid naturally occurring in the L-configuration (which can also be referred to as the R
or S, depending upon the structure of the chemical entity) can be replaced with the amino
acid of the same chemical structural type or a peptidomimetic, but of the opposite chirality,
referred to as the D- amino acid, but also can be referred to as the R- or S- form. 

The invention also provides methods for modifying the polypeptides of the
invention by either natural processes, such as post-translational processing (e.g.,
phosphorylation, acylation, etc.), or by chemical modification techniques, and the resulting
modified polypeptides. Modifications can occur anywhere in the polypeptide, including the
peptide backbone, the amino acid side-chains and the amino or carboxyl termini. It will be
appreciated that the same type of modification may be present in the same or varying degrees
at several sites in a given polypeptide. Also a given polypeptide may have many types of
modifications. Modifications include acetylation, acylation, ADP-ribosylation, amidation,
covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of
a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative,
covalent attachment of a phosphatidylinositol, cross-linking cyclization, disulfide bond
formation, demethylation, formation of covalent cross-links, formation of cysteine, formation
of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation,
hydroxylation, iodination, methylation, myristoylation, oxidation, pegylation, proteolytic
processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, and
transfer-RNA mediated addition of amino acids to protein such as arginylation. See, e.g.,
Creighton, T.E., Proteins - Structure and Molecular Properties 2nd Ed., W.H. Freeman and Company,
New York (1993); Posttranslational Covalent Modification of Proteins, B.C. Johnson, Ed., 
Academic Press, New York, pp. 1-12 (1983). 

Solid-phase chemical peptide synthesis methods can also be used to synthesize
the polypeptide or fragments of the invention. Such methods have been known in the art since
the early 1960's (Merrifield, R. B., J. Am. Chem. Soc., 85:2149-2154, 1963) (See also
Stewart, J. M. and Young, J. D., Solid Phase Peptide Synthesis, 2nd Ed., Pierce Chemical
Co., Rockford, Ill., pp. 11-12) and have recently been employed in commercially available
laboratory peptide design and synthesis kits (Cambridge Research Biochemicals). Such
commercially available laboratory kits have generally utilized the teachings of H. M. Geysen
et al., Proc. Natl. Acad. Sci., USA, 81:3998 (1984) and provide for synthesizing peptides upon
the tips of a multitude of "rods" or "pins" all of which are connected to a single plate. When
such a system is utilized, a plate of rods or pins is inverted and inserted into a second plate of
corresponding wells or reservoirs, which contain solutions for attaching or anchoring an
appropriate amino acid to the tips of the pins or rods. By repeating such a process step, i.e.,
inverting and inserting the tips of the rods and pins into appropriate solutions, amino acids are
built into desired peptides. In addition, a number of FMOC peptide synthesis systems
are available. For example, assembly of a polypeptide or fragment can be carried out on a
solid support using an Applied Biosystems, Inc. Model 431A™ automated peptide
synthesizer. Such equipment provides ready access to the peptides of the invention, either by
direct synthesis or by synthesis of a series of fragments that can be coupled using other 
known techniques. 

The invention includes xylanases of the invention with and without signal sequences.
The polypeptide comprising a signal sequence of the invention can be a xylanase of the
invention or another xylanase or another enzyme or other polypeptide. 

The invention includes immobilized xylanases, anti-xylanase antibodies and 
fragments thereof. The invention provides methods for inhibiting xylanase activity, e.g., 
using dominant negative mutants or anti-xylanase antibodies of the invention. The invention 
includes heterocomplexes, e.g., fusion proteins, heterodimers, etc., comprising the xylanases
of the invention. 

Polypeptides of the invention can have a xylanase activity under various 
conditions, e.g., extremes in pH and/or temperature, oxidizing agents, and the like. The 
invention provides methods leading to alternative xylanase preparations with different 
catalytic efficiencies and stabilities, e.g., towards temperature, oxidizing agents and changing
wash conditions. In one aspect, xylanase variants can be produced using techniques of site- 
directed mutagenesis and/or random mutagenesis. In one aspect, directed evolution can be 
used to produce a great variety of xylanase variants with alternative specificities and stability. 

The proteins of the invention are also useful as research reagents to identify
xylanase modulators, e.g., activators or inhibitors of xylanase activity. Briefly, test samples
(compounds, broths, extracts, and the like) are added to xylanase assays to determine their
ability to inhibit substrate cleavage. Inhibitors identified in this way can be used in industry
and research to reduce or prevent undesired proteolysis. As with xylanases, inhibitors can be
combined to increase the spectrum of activity.

The enzymes of the invention are also useful as research reagents to digest 
proteins or in protein sequencing. For example, the xylanases may be used to break 
polypeptides into smaller fragments for sequencing using, e.g., an automated sequencer.

The invention also provides methods of discovering new xylanases using the
nucleic acids, polypeptides and antibodies of the invention. In one aspect, phagemid libraries
are screened for expression-based discovery of xylanases. In another aspect, lambda phage
libraries are screened for expression-based discovery of xylanases. Screening of the phage or
phagemid libraries can allow the detection of toxic clones; improved access to substrate;
reduced need for engineering a host, by-passing the potential for any bias resulting from mass
excision of the library; and, faster growth at low clone densities. Screening of phage or
phagemid libraries can be in liquid phase or in solid phase. In one aspect, the invention 
provides screening in liquid phase. This gives a greater flexibility in assay conditions; 
additional substrate flexibility; higher sensitivity for weak clones; and ease of automation 
over solid phase screening. 

The invention provides screening methods using the proteins and nucleic acids
of the invention and robotic automation to enable the execution of many thousands of
biocatalytic reactions and screening assays in a short period of time, e.g., per day, as well as
ensuring a high level of accuracy and reproducibility (see discussion of arrays, below). As a
result, a library of derivative compounds can be produced in a matter of weeks. For further
teachings on modification of molecules, including small molecules, see PCT/US94/09174.

Another aspect of the invention is an isolated or purified polypeptide 
comprising the sequence of one of Group A nucleic acid sequences and sequences 
substantially identical thereto, or fragments comprising at least about 5, 10, 15, 20, 25, 30, 35, 
40, 50, 75, 100, or 150 consecutive amino acids thereof. As discussed above, such 
polypeptides may be obtained by inserting a nucleic acid encoding the polypeptide into a
vector such that the coding sequence is operably linked to a sequence capable of driving the
expression of the encoded polypeptide in a suitable host cell. For example, the expression
vector may comprise a promoter, a ribosome binding site for translation initiation and a
transcription terminator. The vector may also include appropriate sequences for amplifying
expression. 

Another aspect of the invention is polypeptides or fragments thereof which 
have at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least 
about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at 
least about 95%, or more than about 95% homology to one of the polypeptides of Group B
amino acid sequences and sequences substantially identical thereto, or a fragment comprising
at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof.
Homology may be determined using any of the programs described above which align the
polypeptides or fragments being compared and determine the extent of amino acid identity
or similarity between them. It will be appreciated that amino acid "homology" includes
conservative amino acid substitutions such as those described above. 
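
By way of illustration only, the following Python sketch shows one way the extent of amino acid identity and similarity could be scored for a pair of sequences that have already been aligned. The function name, the gap character and the small set of conservative substitution groups are assumptions made for this example; they are not the scoring scheme of any of the programs referred to above.

```python
# Hypothetical conservative-substitution groups, chosen only for illustration.
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ"),
]


def percent_identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (% identity, % similarity) over the ungapped aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # columns containing a gap are not scored
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # an identical residue also counts as similar
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            similar += 1  # conservative substitution
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared
```

Under these assumptions, a candidate polypeptide would be compared against a reference sequence after alignment, and the resulting percentage checked against the chosen threshold (e.g., at least about 50%).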

The polypeptides or fragments having homology to one of the polypeptides of 
Group B amino acid sequences and sequences substantially identical thereto, or a fragment 
comprising at least about 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino 
acids thereof may be obtained by isolating the nucleic acids encoding them using the
techniques described above. 

Alternatively, the homologous polypeptides or fragments may be obtained 
through biochemical enrichment or purification procedures. The sequence of potentially 
homologous polypeptides or fragments may be determined by xylan hydrolase digestion, gel 
electrophoresis and/or microsequencing. The sequence of the prospective homologous
polypeptide or fragment can be compared to one of the polypeptides of Group B amino acid
sequences and sequences substantially identical thereto, or a fragment comprising at least 
about 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof using 
any of the programs described above. 

Another aspect of the invention is an assay for identifying fragments or
variants of Group B amino acid sequences and sequences substantially identical thereto,
which retain the enzymatic function of the polypeptides of Group B amino acid sequences
and sequences substantially identical thereto. For example, the fragments or variants of said
polypeptides may be used to catalyze biochemical reactions, which indicate that the fragment
or variant retains the enzymatic activity of the polypeptides in the Group B amino acid
sequences. 

The assay for determining if fragments or variants retain the enzymatic
activity of the polypeptides of Group B amino acid sequences and sequences substantially
identical thereto includes the steps of: contacting the polypeptide fragment or variant with a
substrate molecule under conditions which allow the polypeptide fragment or variant to 
function and detecting either a decrease in the level of substrate or an increase in the level of 
the specific reaction product of the reaction between the polypeptide and substrate. 
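
The decision step of such an assay can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name, the argument names and the `threshold` parameter are hypothetical, the measurements are assumed to be concentrations in the same units, and a practical assay would also need appropriate controls and replicate measurements.

```python
def retains_activity(substrate_before, substrate_after,
                     product_before, product_after, threshold=0.0):
    """Report activity if substrate fell or product rose by more than threshold.

    All four values are assumed to be measured in the same units; the
    threshold is a hypothetical margin for measurement noise.
    """
    substrate_decrease = substrate_before - substrate_after
    product_increase = product_after - product_before
    return substrate_decrease > threshold or product_increase > threshold
```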

The polypeptides of Group B amino acid sequences and sequences 
substantially identical thereto or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50,
75, 100, or 150 consecutive amino acids thereof may be used in a variety of applications. For 
example, the polypeptides or fragments thereof may be used to catalyze biochemical 
reactions. In accordance with one aspect of the invention, there is provided a process for 
utilizing the polypeptides of Group B amino acid sequences and sequences substantially 
identical thereto or polynucleotides encoding such polypeptides for hydrolyzing glycosidic
linkages. In such procedures, a substance containing a glycosidic linkage (e.g., a starch) is 
contacted with one of the polypeptides of Group B amino acid sequences, or sequences 
substantially identical thereto under conditions which facilitate the hydrolysis of the 
glycosidic linkage. 

The present invention exploits the unique catalytic properties of enzymes.
Whereas the use of biocatalysts (i.e., purified or crude enzymes, non-living or living cells) in
chemical transformations normally requires the identification of a particular biocatalyst that
reacts with a specific starting compound, the present invention uses selected biocatalysts and
reaction conditions that are specific for functional groups that are present in many starting
compounds, such as small molecules. Each biocatalyst is specific for one functional group,
or several related functional groups and can react with many starting compounds containing
this functional group.

The biocatalytic reactions produce a population of derivatives from a single 
starting compound. These derivatives can be subjected to another round of biocatalytic 
reactions to produce a second population of derivative compounds. Thousands of variations
of the original small molecule or compound can be produced with each iteration of 
biocatalytic derivatization. 

Enzymes react at specific sites of a starting compound without affecting the 
rest of the molecule, a process which is very difficult to achieve using traditional chemical 
methods. This high degree of biocatalytic specificity provides the means to identify a single
active compound within the library. The library is characterized by the series of biocatalytic
reactions used to produce it, a so-called "biosynthetic history". Screening the library for
biological activities and tracing the biosynthetic history identifies the specific reaction
sequence producing the active compound. The reaction sequence is repeated and the structure
of the synthesized compound determined. This mode of identification, unlike other synthesis
and screening approaches, does not require immobilization technologies and compounds can
be synthesized and tested free in solution using virtually any type of screening assay. It is
important to note that the high degree of specificity of enzyme reactions on functional
groups allows for the "tracking" of specific enzymatic reactions that make up the
biocatalytically produced library. 
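
The notion of a "biosynthetic history" can be illustrated with a short Python sketch in which each library member is stored together with the ordered series of reactions that produced it. The function name, the string labels for compounds and the reaction names used in the usage note are hypothetical placeholders, not reactions or compounds taken from this disclosure.

```python
def build_library(starting_compound, rounds):
    """Map each derivative to the tuple of reactions that produced it.

    `rounds` is a list of rounds; each round is a list of reaction names,
    and every reaction in a round is applied to every compound produced
    by the previous round.
    """
    library = {starting_compound: ()}
    current = {starting_compound: ()}
    for reactions in rounds:
        next_generation = {}
        for compound, history in current.items():
            for reaction in reactions:
                derivative = f"{compound}+{reaction}"
                next_generation[derivative] = history + (reaction,)
        library.update(next_generation)
        current = next_generation
    return library
```

For example, `build_library("S", [["acylation", "glycosylation"], ["oxidation"]])` records the history `("glycosylation", "oxidation")` for the derivative labeled `"S+glycosylation+oxidation"`, so that an active library member can be traced back to the reaction sequence that should be repeated.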

Many of the procedural steps are performed using robotic automation enabling 
the execution of many thousands of biocatalytic reactions and screening assays per day as 
well as ensuring a high level of accuracy and reproducibility. As a result, a library of 
derivative compounds can be produced in a matter of weeks which would take years to
produce using current chemical methods. 

In a particular aspect, the invention provides a method for modifying small 
molecules, comprising contacting a polypeptide encoded by a polynucleotide described 
herein or enzymatically active fragments thereof with a small molecule to produce a 
modified small molecule. A library of modified small molecules is tested to determine if a
modified small molecule is present within the library which exhibits a desired activity. A 
specific biocatalytic reaction which produces the modified small molecule of desired activity 
is identified by systematically eliminating each of the biocatalytic reactions used to produce a 
portion of the library and then testing the small molecules produced in the portion of the 
library for the presence or absence of the modified small molecule with the desired activity.
The specific biocatalytic reactions which produce the modified small molecule of desired
activity are optionally repeated. The biocatalytic reactions are conducted with a group of
biocatalysts that react with distinct structural moieties found within the structure of a small
molecule; each biocatalyst is specific for one structural moiety or a group of related structural
moieties; and each biocatalyst reacts with many different small molecules which contain the
distinct structural moiety. 
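
The systematic elimination described above can be sketched as a leave-one-out loop. In this minimal Python illustration, `library_portion_is_active` is a hypothetical callable standing in for the laboratory steps of producing a portion of the library with the given reactions and testing it for the desired activity; it is an assumption of the example, not a method defined in this disclosure.

```python
def find_required_reactions(reactions, library_portion_is_active):
    """Return the reactions whose omission abolishes the desired activity.

    Each reaction is left out in turn; the portion of the library made
    with the remaining reactions is then tested via the supplied callable.
    """
    required = []
    for omitted in reactions:
        remaining = [r for r in reactions if r != omitted]
        if not library_portion_is_active(remaining):
            required.append(omitted)
    return required
```

A reaction is reported only when the desired activity disappears in its absence, which identifies the specific reactions that would then optionally be repeated.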

Xylanase signal sequences, prepro and catalytic domains

The invention provides xylanase signal sequences (e.g., signal peptides (SPs)), 
prepro domains and catalytic domains (CDs). The SPs, prepro domains and/or CDs of the 
invention can be isolated or recombinant peptides or can be part of a fusion protein, e.g., as a 
heterologous domain in a chimeric protein. The invention provides nucleic acids encoding
these catalytic domains (CDs), prepro domains and signal sequences (SPs, e.g., a peptide
having a sequence comprising/consisting of amino terminal residues of a polypeptide of the
invention). In one aspect, the invention provides a signal sequence comprising a peptide
comprising/consisting of a sequence as set forth in residues 1 to 15, 1 to 16, 1 to 17, 1 to 18,
1 to 19, 1 to 20, 1 to 21, 1 to 22, 1 to 23, 1 to 24, 1 to 25, 1 to 26, 1 to 27, 1 to 28, 1 to 29, 1 to
30, 1 to 31, 1 to 32, 1 to 33, 1 to 34, 1 to 35, 1 to 36, 1 to 37, 1 to 38, 1 to 39, 1 to 40, 1 to 41,
1 to 42, 1 to 43, 1 to 44 of a polypeptide of the invention.

In one aspect, the invention provides a signal sequence comprising a peptide 
comprising/consisting of a sequence as set forth in Table 4 below. For example, in reading
Table 4, the invention provides a signal sequence comprising/consisting of residues 1 to 23
of SEQ ID NO:102 (encoded by SEQ ID NO:101), a signal sequence comprising/consisting
of residues 1 to 41 of SEQ ID NO:104 (encoded by SEQ ID NO:103), etc.

Table 4: exemplary signal sequences of the invention 

SEQ ID NO:      Signal sequence (amino acid positions)

101, 102        1-23
103, 104        1-41
105, 106        1-22
109, 110        1-26
11, 12          1-28
113, 114        1-28
119, 120        1-33
121, 122        1-20
123, 124        1-20
131, 132        1-26
135, 136        1-25
139, 140        1-24
141, 142        1-25
143, 144        1-32
147, 148        1-28
149, 150        1-18
15, 16          1-20
151, 152        1-21
153, 154        1-16
155, 156        1-21
157, 158        1-29
159, 160        1-23
161, 162        1-32
163, 164        1-26
165, 166        1-23
167, 168        1-36
169, 170        1-24
17, 18          1-31
171, 172        1-29
173, 174        1-22
175, 176        1-27
177, 178        1-26
179, 180        1-19
181, 182        1-25
183, 184        1-32
185, 186        1-27
187, 188        1-28
19, 20          1-29
191, 192        1-27
193, 194        1-21
195, 196        1-23
197, 198        1-28
199, 200        1-30
203, 204        1-30
205, 206        1-29
207, 208        1-27
209, 210        1-25
21, 22          1-28
211, 212        1-29
215, 216        1-31
217, 218        1-29
219, 220        1-23
221, 222        1-24
223, 224        1-28
225, 226        1-25
227, 228        1-39
229, 230        1-28
23, 24          1-29
231, 232        1-41
233, 234        1-26
235, 236        1-28
237, 238        1-32
239, 240        1-30
241, 242        1-28
243, 244        1-33
245, 246        1-32
249, 250        1-33
253, 254        1-24
255, 256        1-51
259, 260        1-24
261, 262        1-26
263, 264        1-29
267, 268        1-30
27, 28          1-27
271, 272        1-22
273, 274        1-74
277, 278        1-19
279, 280        1-22
283, 284        1-28
287, 288        1-23
289, 290        1-22
295, 296        1-26
299, 300        1-24
301, 302        1-28
303, 304        1-74
305, 306        1-32
309, 310        1-20
311, 312        1-33
313, 314        1-22
315, 316        1-28
319, 320        1-27
325, 326        1-27
327, 328        1-29
329, 330        1-35
33, 34          1-23
331, 332        1-28
333, 334        1-30
335, 336        1-50
339, 340        1-23
341, 342        1-45
347, 348        1-20
349, 350        1-20
351, 352        1-73
353, 354        1-18
355, 356        1-21
357, 358        1-25
359, 360        1-31
361, 362        1-26
365, 366        1-65
367, 368        1-23
369, 370        1-27
39, 40          1-24
41, 42          1-37
45, 46          1-25
47, 48          1-26
5, 6            1-47
51, 52          1-30
53, 54          1-37
55, 56          1-24
57, 58          1-22
59, 60          1-21
63, 64          1-20
65, 66          1-22
67, 68          1-28
69, 70          1-25
7, 8            1-57
73, 74          1-21
75, 76          1-22
77, 78          1-27
79, 80          1-36
83, 84          1-30
87, 88          1-29
89, 90          1-40
9, 10           1-36
95, 96          1-24
99, 100         1-33
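
As a minimal sketch of how the positions in Table 4 may be read, the following Python example separates an amino acid sequence into its signal sequence and the remainder. Only the two rows that the text itself explains (SEQ ID NO:102 and SEQ ID NO:104) are transcribed; the dictionary name, the function name and the placeholder sequence in the usage note are assumptions made for this example and do not reproduce any actual sequence of the invention.

```python
# Polypeptide SEQ ID NO -> last residue (1-based) of its signal sequence,
# transcribed from two rows of Table 4.
SIGNAL_SEQUENCE_END = {102: 23, 104: 41}


def split_signal_sequence(seq_id_no, sequence):
    """Return (signal sequence, remaining sequence) for a polypeptide.

    Residues are numbered from 1, so residues 1 to N correspond to the
    Python slice sequence[:N].
    """
    end = SIGNAL_SEQUENCE_END[seq_id_no]
    if len(sequence) < end:
        raise ValueError("sequence is shorter than its listed signal sequence")
    return sequence[:end], sequence[end:]
```

For instance, with a 26-residue placeholder string standing in for SEQ ID NO:102, the function returns the first 23 residues as the signal sequence and the last 3 residues as the remainder.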



The xylanase signal sequences (SPs) and/or prepro sequences of the invention 
can be isolated peptides, or, sequences joined to another xylanase or a non-xylanase 
polypeptide, e.g., as a fusion (chimeric) protein. In one aspect, the invention provides 
polypeptides comprising xylanase signal sequences of the invention. In one aspect,
polypeptides comprising xylanase signal sequences (SPs) and/or prepro sequences of the
invention comprise sequences heterologous to a xylanase of the invention (e.g., a fusion protein
comprising an SP and/or prepro of the invention and sequences from another xylanase or a
non-xylanase protein). In one aspect, the invention provides xylanases of the invention with
heterologous SPs and/or prepro sequences, e.g., sequences with a yeast signal sequence. A
xylanase of the invention can comprise a heterologous SP and/or prepro in a vector, e.g., a 
pPIC series vector (Invitrogen, Carlsbad, CA). 

In one aspect, SPs and/or prepro sequences of the invention are identified 
following identification of novel xylanase polypeptides. The pathways by which proteins are
sorted and transported to their proper cellular location are often referred to as protein
targeting pathways. One of the most important elements in all of these targeting systems is a
short amino acid sequence at the amino terminus of a newly synthesized polypeptide called
the signal sequence. This signal sequence directs a protein to its appropriate location in the
cell and is removed during transport or when the protein reaches its final destination. Most
lysosomal, membrane, or secreted proteins have an amino-terminal signal sequence that
marks them for translocation into the lumen of the endoplasmic reticulum. More than 100
signal sequences for proteins in this group have been determined. The signal sequences can
vary in length from 13 to 36 amino acid residues. Various methods of recognition of signal
sequences are known to those of skill in the art. For example, in one aspect, novel xylanase
signal peptides are identified by a method referred to as SignalP. SignalP uses a combined
neural network which recognizes both signal peptides and their cleavage sites (Nielsen, et
al., "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites." Protein Engineering, vol. 10, no. 1, p. 1-6 (1997)).

It should be understood that in some aspects xylanases of the invention may 
not have SPs and/or prepro sequences, or "domains." In one aspect, the invention provides 
the xylanases of the invention lacking all or part of an SP and/or a prepro domain. In one
aspect, the invention provides a nucleic acid sequence encoding a signal sequence (SP) and/or
prepro from one xylanase operably linked to a nucleic acid sequence of a different xylanase
or, optionally, a signal sequence (SP) and/or prepro domain from a non-xylanase protein
may be desired.

The invention also provides isolated or recombinant polypeptides comprising
signal sequences (SPs), prepro domains and/or catalytic domains (CDs) of the invention and
heterologous sequences. The heterologous sequences are sequences not naturally associated
(e.g., to a xylanase) with an SP, prepro domain and/or CD. The sequence to which the SP,
prepro domain and/or CD are not naturally associated can be on the SP's, prepro domain's
and/or CD's amino terminal end, carboxy terminal end, and/or on both ends of the SP and/or
CD. In one aspect, the invention provides an isolated or recombinant polypeptide comprising
(or consisting of) a polypeptide comprising a signal sequence (SP), prepro domain and/or
catalytic domain (CD) of the invention with the proviso that it is not associated with any
sequence to which it is naturally associated (e.g., a xylanase sequence). Similarly in one
aspect, the invention provides isolated or recombinant nucleic acids encoding these
polypeptides. Thus, in one aspect, the isolated or recombinant nucleic acid of the invention
comprises coding sequence for a signal sequence (SP), prepro domain and/or catalytic
domain (CD) of the invention and a heterologous sequence (i.e., a sequence not naturally
associated with a signal sequence (SP), prepro domain and/or catalytic domain (CD) of
the invention). The heterologous sequence can be on the 3' terminal end, 5' terminal end,
and/or on both ends of the SP, prepro domain and/or CD coding sequence. 

Hybrid (chimeric) xylanases and peptide libraries 

In one aspect, the invention provides hybrid xylanases and fusion proteins,
including peptide libraries, comprising sequences of the invention. The peptide libraries of
the invention can be used to isolate peptide modulators (e.g., activators or inhibitors) of
targets, such as xylanase substrates, receptors and enzymes. The peptide libraries of the
invention can be used to identify formal binding partners of targets, such as ligands, e.g.,
cytokines, hormones and the like. In one aspect, the invention provides chimeric proteins
comprising a signal sequence (SP), prepro domain and/or catalytic domain (CD) of the
invention or a combination thereof and a heterologous sequence (see above).

In one aspect, the fusion proteins of the invention (e.g., the peptide moiety) are
conformationally stabilized (relative to linear peptides) to allow a higher binding affinity for
targets. The invention provides fusions of xylanases of the invention and other peptides,
including known and random peptides. They can be fused in such a manner that the structure
of the xylanases is not significantly perturbed and the peptide is metabolically or structurally
conformationally stabilized. This allows the creation of a peptide library that is easily
monitored both for its presence within cells and its quantity.

Amino acid sequence variants of the invention can be characterized by a
predetermined nature of the variation, a feature that sets them apart from a naturally
occurring form, e.g., an allelic or interspecies variation of a xylanase sequence. In one
aspect, the variants of the invention exhibit the same qualitative biological activity as the
naturally occurring analogue. Alternatively, the variants can be selected for having modified
characteristics. In one aspect, while the site or region for introducing an amino acid sequence
variation is predetermined, the mutation per se need not be predetermined. For example, in
order to optimize the performance of a mutation at a given site, random mutagenesis may be
conducted at the target codon or region and the expressed xylanase variants screened for the
optimal combination of desired activity. Techniques for making substitution mutations at
predetermined sites in DNA having a known sequence are well known, as discussed herein,
for example, M13 primer mutagenesis and PCR mutagenesis. Screening of the mutants can
be done using, e.g., assays of xylan hydrolysis. In alternative aspects, amino acid
substitutions can be single residues; insertions can be on the order of from about 1 to 20
amino acids, although considerably larger insertions can be done. Deletions can range from
about 1 to about 20, 30, 40, 50, 60, 70 residues or more. To obtain a final derivative with the
optimal properties, substitutions, deletions, insertions or any combination thereof may be
used. Generally, these changes are done on a few amino acids to minimize the alteration of
the molecule. However, larger changes may be tolerated in certain circumstances.
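
By way of illustration only, mutagenesis at one predetermined site, as described above, can be modeled in silico by enumerating every single-residue substitution at that site. The following is a minimal sketch, not a method of the invention: the function name `saturate_site`, the 1-based position convention and the toy sequence are assumptions made for this example, and the toy sequence is not a sequence of the invention.

```python
# Hypothetical helper: enumerate all single-residue substitutions at one
# predetermined, 1-based position of a protein sequence.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def saturate_site(sequence: str, position: int) -> dict[str, str]:
    """Return {mutation_label: variant_sequence} for every substitution."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    original = sequence[position - 1]
    variants = {}
    for residue in AMINO_ACIDS:
        if residue == original:
            continue  # skip the unchanged (wild-type) residue
        label = f"{original}{position}{residue}"
        variants[label] = sequence[:position - 1] + residue + sequence[position:]
    return variants


# Toy sequence for illustration only.
toy_sequence = "MKTAYIAK"
site_library = saturate_site(toy_sequence, 4)
print(len(site_library), "variants at position 4")
```

Each variant in such a list would still have to be expressed and screened, e.g., in an assay of xylan hydrolysis, to find the desired activity.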

The invention provides xylanases where the structure of the polypeptide
backbone, the secondary or the tertiary structure, e.g., an alpha-helical or beta-sheet structure,
has been modified. In one aspect, the charge or hydrophobicity has been modified. In one
aspect, the bulk of a side chain has been modified. Substantial changes in function or
immunological identity are made by selecting substitutions that are less conservative. For
example, substitutions can be made which more significantly affect: the structure of the
polypeptide backbone in the area of the alteration, for example an alpha-helical or a beta-sheet
structure; a charge or a hydrophobic site of the molecule, which can be at an active site; or a
side chain. The invention provides substitutions in a polypeptide of the invention where (a) a
hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a hydrophobic residue,
e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for
(or by) any other residue; (c) a residue having an electropositive side chain, e.g., lysyl,
arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g., glutamyl or
aspartyl; or (d) a residue having a bulky side chain, e.g., phenylalanine, is substituted for (or
by) one not having a side chain, e.g., glycine. The variants can exhibit the same qualitative
biological activity (i.e., xylanase activity) although variants can be selected to modify the
characteristics of the xylanases as needed.
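
By way of illustration only, the substitution classes (a) to (d) above can be expressed as a simple lookup. The following is a minimal sketch under stated assumptions: the residue sets contain only the examples named in the text (given there as "e.g." lists, so they are not exhaustive), and the function name `substitution_classes` is hypothetical.

```python
# Residue groups limited to the examples named in the text (one-letter codes).
HYDROPHILIC = set("ST")       # seryl, threonyl
HYDROPHOBIC = set("LIFVA")    # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
ELECTROPOSITIVE = set("KRH")  # lysyl, arginyl, histidyl
ELECTRONEGATIVE = set("ED")   # glutamyl, aspartyl
BULKY = set("F")              # phenylalanine
NO_SIDE_CHAIN = set("G")      # glycine
CYS_OR_PRO = set("CP")


def _either_direction(old, new, group1, group2):
    """True if the change goes from one group to the other ("for (or by)")."""
    return (old in group1 and new in group2) or (old in group2 and new in group1)


def substitution_classes(old: str, new: str) -> list[str]:
    """Return which of classes (a)-(d) a single-residue substitution falls in."""
    if old == new:
        return []
    classes = []
    if _either_direction(old, new, HYDROPHILIC, HYDROPHOBIC):
        classes.append("a")
    if old in CYS_OR_PRO or new in CYS_OR_PRO:
        classes.append("b")
    if _either_direction(old, new, ELECTROPOSITIVE, ELECTRONEGATIVE):
        classes.append("c")
    if _either_direction(old, new, BULKY, NO_SIDE_CHAIN):
        classes.append("d")
    return classes


print(substitution_classes("S", "L"))
print(substitution_classes("K", "E"))
```

A substitution that returns an empty list is not one of the named less-conservative examples; that does not by itself establish that it is conservative.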

In one aspect, xylanases of the invention comprise epitopes or purification
tags, signal sequences or other fusion sequences, etc. In one aspect, the xylanases of the
invention can be fused to a random peptide to form a fusion polypeptide. By "fused" or
"operably linked" herein is meant that the random peptide and the xylanase are linked
together, in such a manner as to minimize the disruption to the stability of the xylanase
structure, e.g., it retains xylanase activity. The fusion polypeptide (or fusion polynucleotide
encoding the fusion polypeptide) can comprise further components as well, including
multiple peptides at multiple loops.

In one aspect, the peptides and nucleic acids encoding them are randomized,
either fully randomized or biased in their randomization, e.g., in nucleotide/residue
frequency generally or per position. "Randomized" means that each nucleic acid and peptide
consists of essentially random nucleotides and amino acids, respectively. In one aspect, the
nucleic acids which give rise to the peptides can be chemically synthesized, and thus may
incorporate any nucleotide at any position. Thus, when the nucleic acids are expressed to
form peptides, any amino acid residue may be incorporated at any position. The synthetic
process can be designed to generate randomized nucleic acids, to allow the formation of all or
most of the possible combinations over the length of the nucleic acid, thus forming a library
of randomized nucleic acids. The library can provide a sufficiently structurally diverse
population of randomized expression products to effect a probabilistically sufficient range of
cellular responses to provide one or more cells exhibiting a desired response. Thus, the
invention provides an interaction library large enough so that at least one of its members will
have a structure that gives it affinity for some molecule, protein, or other factor.
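
By way of illustration only, the difference between full randomization and randomization biased per position can be sketched as follows. The NNK scheme used here (any base at the first two codon positions, G or T at the third) is a commonly used biasing scheme chosen for this example; it is an assumption and is not stated in this specification. The names `randomized_oligo` and `make_library` are likewise hypothetical.

```python
import random

BASES = "ACGT"
N = (1, 1, 1, 1)  # weights for A, C, G, T: any base is allowed
K = (0, 0, 1, 1)  # weights for A, C, G, T: only G or T is allowed


def randomized_oligo(position_weights, rng):
    """Draw one oligonucleotide, one base per position, using per-position weights."""
    return "".join(rng.choices(BASES, weights=w, k=1)[0] for w in position_weights)


def make_library(n_members, n_codons, rng, biased=True):
    """Build a library of randomized coding oligonucleotides.

    biased=True uses NNK codons; biased=False uses fully random NNN codons.
    """
    codon = (N, N, K) if biased else (N, N, N)
    return [randomized_oligo(codon * n_codons, rng) for _ in range(n_members)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
nnk_library = make_library(n_members=1000, n_codons=6, rng=rng)
print(nnk_library[0])
print(len(set(nnk_library)), "distinct members out of", len(nnk_library))
```

The number of distinct members actually obtained, relative to the number of possible sequences, gives a rough sense of how much of the sequence space a library of a given size can cover.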

Xylanases are multidomain enzymes that consist optionally of a signal
peptide, a carbohydrate binding module, a xylanase catalytic domain, a linker and/or another
catalytic domain.

The invention provides a means for generating chimeric polynucleotides which
may encode biologically active hybrid polypeptides (e.g., hybrid xylanases). In one aspect,
the original polynucleotides encode biologically active polypeptides. The method of the
invention produces new hybrid polypeptides by utilizing cellular processes which integrate
the sequence of the original polynucleotides such that the resulting hybrid polynucleotide
encodes a polypeptide demonstrating activities derived from the original biologically active
polypeptides. For example, the original polynucleotides may encode a particular enzyme
from different microorganisms. An enzyme encoded by a first polynucleotide from one
organism or variant may, for example, function effectively under a particular environmental
condition, e.g., high salinity. An enzyme encoded by a second polynucleotide from a different
organism or variant may function effectively under a different environmental condition, such
as extremely high temperatures. A hybrid polynucleotide containing sequences from the first
and second original polynucleotides may encode an enzyme which exhibits characteristics of
both enzymes encoded by the original polynucleotides. Thus, the enzyme encoded by the
hybrid polynucleotide may function effectively under environmental conditions shared by
each of the enzymes encoded by the first and second polynucleotides, e.g., high salinity and
extreme temperatures.

Enzymes encoded by the polynucleotides of the invention include, but are not
limited to, hydrolases, such as xylanases. Glycosidase hydrolases were first classified into
families in 1991, see, e.g., Henrissat (1991) Biochem. J. 280:309-316. Since then, the
classifications have been continually updated, see, e.g., Henrissat (1993) Biochem. J.
293:781-788; Henrissat (1996) Biochem. J. 316:695-696; Henrissat (2000) Plant Physiology
124:1515-1519. There are 87 identified families of glycosidase hydrolases. In one aspect,
the xylanases of the invention may be categorized in families 8, 10, 11, 26 and 30. In one
aspect, the invention also provides xylanase-encoding nucleic acids with a common novelty
in that they are derived from a common family, e.g., family 5, 6, 8, 10, 11, 26 or 30, as set
forth in Table 5, below.

Table 5 

SEQ ID NOS    FAMILY

9, 10         8
1, 2          8
5, 6          8
7, 8          8
99, 100       10
11, 12        10
127, 128      10
27, 28        10
97, 98        10
45, 46        10
141, 142      10
107, 108      10
129, 130      10
93, 94        10
63, 64        10
25, 26        10
49, 50        10
67, 68        10
85, 86        10
29, 30        10
51, 52        10
35, 36        10
147, 148      10
119, 120      10
123, 124      10
249, 250      10
149, 150      10
83, 84        10
43, 44        10
133, 134      10
113, 114      10
105, 106      10
75, 76        10
111, 112      10
117, 118      10
115, 116      10
125, 126      10
137, 138      10
135, 136      10
69, 70        10
89, 90        10
31, 32        10
13, 14        10
65, 66        10
57, 58        10
77, 78        10
73, 74        10
109, 110      10
59, 60        10
71, 72        10
139, 140      10
55, 56        10
15, 16        10
131, 132      10
95, 96        10
101, 102      10
39, 40        10
143, 144      10
103, 104      10
17, 18        10
53, 54        10
21, 22        10
151, 152      10
23, 24        10
121, 122      10
41, 42        10
47, 48        10
247, 248      10
33, 34        10
19, 20        10
87, 88        10
81, 82        10
91, 92        10
61, 62        10
37, 38        10
79, 80        10
231, 232      11
157, 158      11
189, 190      11
167, 168      11
207, 208      11
251, 252      11
213, 214      11
177, 178      11
187, 188      11
205, 206      11
211, 212      11
197, 198      11
209, 210      11
185, 186      11
229, 230      11
223, 224      11
179, 180      11
193, 194      11
173, 174      11
217, 218      11
153, 154      11
219, 220      11
183, 184      11
253, 254      11
199, 200      11
255, 256      11
155, 156      11
169, 170      11
195, 196      11
215, 216      11
191, 192      11
175, 176      11
161, 162      11
221, 222      11
225, 226      11
163, 164      11
159, 160      11
233, 234      11
171, 172      11
203, 204      11
181, 182      11
227, 228      11
165, 166      11
257, 258      26
237, 238      30
241, 242      30
239, 240      30
245, 246      30
235, 236      30
313, 314      30
345, 346      10
321, 322      10
323, 324      10
315, 316      10
201, 202      10
265, 266      10
145, 146      10
287, 288      10
293, 294      10
351, 352      10
311, 312      10
279, 280      10
289, 290      10
283, 284      10
373, 374      10
337, 338      10
371, 372      10
291, 292      10
3, 4          10
307, 308      10
343, 344      10
349, 350      10
329, 330      10
355, 356      10
339, 340      10
295, 296      10
333, 334      10
281, 282      10
361, 362      10
347, 348      10
319, 320      10
357, 358      10
365, 366      10
273, 274      10
277, 278      10
271, 272      10
285, 286      10
259, 260      10
325, 326      10
331, 332      10
359, 360      10
303, 304      10
363, 364      10
305, 306      10
341, 342      10
375, 376      11
377, 378      11
379, 380      11
301, 302      11
309, 310      11
263, 264      11
269, 270      11
353, 354      11
299, 300      11
367, 368      11
261, 262      11
369, 370      11
267, 268      11
317, 318      11
297, 298      11
327, 328      5
275, 276      6



A hybrid polypeptide resulting from the method of the invention may exhibit 
specialized enzyme activity not displayed in the original enzymes. For example, following 
recombination and/or reductive reassortment of polynucleotides encoding hydrolase 
activities, the resulting hybrid polypeptide encoded by a hybrid polynucleotide can be 
screened for specialized hydrolase activities obtained from each of the original enzymes, i.e. 
the type of bond on which the hydrolase acts and the temperature at which the hydrolase 
functions. Thus, for example, the hydrolase may be screened to ascertain those chemical 
functionalities which distinguish the hybrid hydrolase from the original hydrolases, such as: 
(a) amide (peptide bonds), i.e., xylanases; (b) ester bonds, i.e., esterases and lipases; (c) 
acetals, i.e., glycosidases and, for example, the temperature, pH or salt concentration at which 
the hybrid polypeptide functions. 

Sources of the original polynucleotides may be isolated from individual
organisms ("isolates"), collections of organisms that have been grown in defined media
("enrichment cultures"), or uncultivated organisms ("environmental samples"). The use of a
culture-independent approach to derive polynucleotides encoding novel bioactivities from
environmental samples is most preferable since it allows one to access untapped resources of
biodiversity.

"Environmental libraries" are generated from environmental samples and 
represent the collective genomes of naturally occurring organisms archived in cloning vectors 
that can be propagated in suitable prokaryotic hosts. Because the cloned DNA is initially
extracted directly from environmental samples, the libraries are not limited to the small
fraction of prokaryotes that can be grown in pure culture. Additionally, a normalization of
the environmental DNA present in these samples could allow more equal representation of
the DNA from all of the species present in the original sample. This can dramatically
increase the efficiency of finding interesting genes from minor constituents of the sample
which may be under-represented by several orders of magnitude compared to the dominant
species.
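
By way of illustration only, the effect of normalization on screening effort can be estimated with the standard sampling relation N = ln(1 - P) / ln(1 - f), where f is the fraction of library clones carrying the gene of interest and P is the desired probability of recovering at least one such clone. The following is a minimal sketch assuming independent, random sampling of clones; the function name `clones_needed` and the example fractions are hypothetical and are not data from this specification.

```python
import math


def clones_needed(fraction: float, probability: float = 0.99) -> int:
    """Clones to screen so that at least one target clone is found with the
    given probability, assuming each clone is an independent random draw."""
    if not 0 < fraction < 1 or not 0 < probability < 1:
        raise ValueError("fraction and probability must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


# Hypothetical fractions: a gene from a minor constituent before normalization,
# and the same gene after normalization raises its representation 1000-fold.
before = clones_needed(1e-6)
after = clones_needed(1e-3)
print(f"before normalization: {before} clones")
print(f"after normalization:  {after} clones")
```

Under these assumptions, the number of clones to be screened falls roughly in proportion to the increase in representation of the minor constituent.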

For example, gene libraries generated from one or more uncultivated
microorganisms are screened for an activity of interest. Potential pathways encoding
bioactive molecules of interest are first captured in prokaryotic cells in the form of gene
expression libraries. Polynucleotides encoding activities of interest are isolated from such
libraries and introduced into a host cell. The host cell is grown under conditions which
promote recombination and/or reductive reassortment creating potentially active
biomolecules with novel or enhanced activities.

Additionally, subcloning may be performed to further isolate sequences of
interest. In subcloning, a portion of DNA is amplified and digested, generally by restriction
enzymes, to cut out the desired sequence; the desired sequence is then ligated into a recipient
vector and is amplified. At each step in subcloning, the portion is examined for the activity
of interest, in order to ensure that DNA that encodes the structural protein has not been
excluded. The insert may be purified at any step of the subcloning, for example, by gel
electrophoresis prior to ligation into a vector or where cells containing the recipient vector
and cells not containing the recipient vector are placed on selective media containing, for
example, an antibiotic, which will kill the cells not containing the recipient vector. Specific
methods of subcloning cDNA inserts into vectors are well-known in the art (Sambrook et al.,
Molecular Cloning: A Laboratory Manual 2nd Ed., Cold Spring Harbor Laboratory Press
(1989)). In another aspect, the enzymes of the invention are subclones. Such subclones may
differ from the parent clone by, for example, length, a mutation, a tag or a label.

In one aspect, the signal sequences of the invention are identified following
identification of novel xylanase polypeptides. The pathways by which proteins are sorted and
transported to their proper cellular location are often referred to as protein targeting
pathways. One of the most important elements in all of these targeting systems is a short
amino acid sequence at the amino terminus of a newly synthesized polypeptide called the
signal sequence. This signal sequence directs a protein to its appropriate location in the cell
and is removed during transport or when the protein reaches its final destination. Most
lysosomal, membrane, or secreted proteins have an amino-terminal signal sequence that
marks them for translocation into the lumen of the endoplasmic reticulum. More than 100
signal sequences for proteins in this group have been determined. The sequences vary in
length from 13 to 36 amino acid residues. Various methods of recognition of signal
sequences are known to those of skill in the art. In one aspect, the peptides are identified by a
method referred to as SignalP. SignalP uses a combined neural network which recognizes
both signal peptides and their cleavage sites. See, e.g., Nielsen (1997) "Identification of
prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites." Protein
Engineering, vol. 10, no. 1, p. 1-6. It should be understood that some of the xylanases of the
invention may or may not contain signal sequences. It may be desirable to include a nucleic
acid sequence encoding a signal sequence from one xylanase operably linked to a nucleic
acid sequence of a different xylanase or, optionally, a signal sequence from a non-xylanase
protein may be desired.
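
By way of illustration only, replacing the signal sequence of one protein with that of another can be sketched at the amino acid level once a cleavage site is known. The following sketch does not predict cleavage sites; it assumes the site has already been obtained from a predictor such as SignalP. The function names and both toy sequences are hypothetical and are not sequences of the invention.

```python
def split_precursor(precursor: str, cleavage_after: int) -> tuple[str, str]:
    """Split a precursor into (signal sequence, mature protein).

    cleavage_after is the number of residues in the signal sequence; it is
    supplied by the caller and is not computed here.
    """
    if not 0 < cleavage_after < len(precursor):
        raise ValueError("cleavage site must fall inside the precursor")
    return precursor[:cleavage_after], precursor[cleavage_after:]


def swap_signal_sequence(precursor: str, cleavage_after: int, heterologous_sp: str) -> str:
    """Replace the native signal sequence with a heterologous one."""
    _, mature = split_precursor(precursor, cleavage_after)
    return heterologous_sp + mature


# Toy data: a 15-residue signal sequence followed by a short "mature" segment.
toy_precursor = "MKLLVLSAALAGAQA" + "TSEQWERTY"
toy_other_sp = "MRFSTLLAVAPLASA"
hybrid_precursor = swap_signal_sequence(toy_precursor, 15, toy_other_sp)
print(hybrid_precursor)
```

In practice the exchange would be made at the nucleic acid level, keeping the signal sequence coding region in frame with the coding sequence to which it is operably linked.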

The microorganisms from which the polynucleotide may be prepared include
prokaryotic microorganisms, such as Eubacteria and Archaebacteria, and lower eukaryotic
microorganisms such as fungi, some algae and protozoa. Polynucleotides may be isolated
from environmental samples in which case the nucleic acid may be recovered without
culturing of an organism or recovered from one or more cultured organisms. In one aspect,
such microorganisms may be extremophiles, such as hyperthermophiles, psychrophiles,
psychrotrophs, halophiles, barophiles and acidophiles. Polynucleotides encoding enzymes
isolated from extremophilic microorganisms can be used. Such enzymes may function at
temperatures above 100°C in terrestrial hot springs and deep sea thermal vents, at
temperatures below 0°C in arctic waters, in the saturated salt environment of the Dead Sea, at
pH values around 0 in coal deposits and geothermal sulfur-rich springs, or at pH values
greater than 11 in sewage sludge. For example, several esterases and lipases cloned and
expressed from extremophilic organisms show high activity throughout a wide range of
temperatures and pHs.

Polynucleotides selected and isolated as hereinabove described are introduced
into a suitable host cell. A suitable host cell is any cell which is capable of promoting
recombination and/or reductive reassortment. The selected polynucleotides are preferably
already in a vector which includes appropriate control sequences. The host cell can be a
higher eukaryotic cell, such as a mammalian cell, or a lower eukaryotic cell, such as a yeast
cell, or preferably, the host cell can be a prokaryotic cell, such as a bacterial cell. Introduction
of the construct into the host cell can be effected by calcium phosphate transfection,
DEAE-Dextran mediated transfection, or electroporation (Davis et al., 1986).

As representative examples of appropriate hosts, there may be mentioned:
bacterial cells, such as E. coli, Streptomyces, Salmonella typhimurium; fungal cells, such as
yeast; insect cells such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS
or Bowes melanoma; adenoviruses; and plant cells. The selection of an appropriate host is
deemed to be within the scope of those skilled in the art from the teachings herein.

With particular reference to various mammalian cell culture systems that can
be employed to express recombinant protein, examples of mammalian expression systems
include the COS-7 lines of monkey kidney fibroblasts, described in "SV40-transformed
simian cells support the replication of early SV40 mutants" (Gluzman, 1981) and other cell
lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines. Mammalian expression vectors will comprise an origin of replication, a
suitable promoter and enhancer and also any necessary ribosome binding sites,
polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences
and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 splice and
polyadenylation sites may be used to provide the required nontranscribed genetic elements.

In another aspect, it is envisioned the method of the present invention can be
used to generate novel polynucleotides encoding biochemical pathways from one or more
operons or gene clusters or portions thereof. For example, bacteria and many eukaryotes
have a coordinated mechanism for regulating genes whose products are involved in related
processes. The genes are clustered, in structures referred to as "gene clusters," on a single
chromosome and are transcribed together under the control of a single regulatory sequence,
including a single promoter which initiates transcription of the entire cluster. Thus, a gene
cluster is a group of adjacent genes that are either identical or related, usually as to their
function. An example of a biochemical pathway encoded by gene clusters is the pathway
that produces polyketides.

Gene cluster DNA can be isolated from different organisms and ligated into
vectors, particularly vectors containing expression regulatory sequences which can control
and regulate the production of a detectable protein or protein-related array activity from the
ligated gene clusters. Use of vectors which have an exceptionally large capacity for
exogenous DNA introduction are particularly appropriate for use with such gene clusters and
are described by way of example herein to include the f-factor (or fertility factor) of E. coli.
This f-factor of E. coli is a plasmid which effects high-frequency transfer of itself during
conjugation and is ideal to achieve and stably propagate large DNA fragments, such as gene
clusters from mixed microbial samples. One aspect of the invention is to use cloning vectors,
referred to as "fosmids" or bacterial artificial chromosome (BAC) vectors. These are derived
from E. coli f-factor which is able to stably integrate large segments of genomic DNA. When
integrated with DNA from a mixed uncultured environmental sample, this makes it possible
to achieve large genomic fragments in the form of a stable "environmental DNA library."
Another type of vector for use in the present invention is a cosmid vector. Cosmid vectors
were originally designed to clone and propagate large segments of genomic DNA. Cloning
into cosmid vectors is described in detail in Sambrook et al., Molecular Cloning: A
Laboratory Manual 2nd Ed., Cold Spring Harbor Laboratory Press (1989). Once ligated into
an appropriate vector, two or more vectors containing different polyketide synthase gene
clusters can be introduced into a suitable host cell. Regions of partial sequence homology
shared by the gene clusters will promote processes which result in sequence reorganization
resulting in a hybrid gene cluster. The novel hybrid gene cluster can then be screened for
enhanced activities not found in the original gene clusters.

Therefore, in one aspect, the invention relates to a method for producing a
biologically active hybrid polypeptide and screening such a polypeptide for enhanced activity
by:

1) introducing at least a first polynucleotide in operable linkage and a second
polynucleotide in operable linkage, the at least first polynucleotide and second
polynucleotide sharing at least one region of partial sequence homology, into a
suitable host cell;

2) growing the host cell under conditions which promote sequence reorganization
resulting in a hybrid polynucleotide in operable linkage;

3) expressing a hybrid polypeptide encoded by the hybrid polynucleotide;

4) screening the hybrid polypeptide under conditions which promote identification
of enhanced biological activity; and

5) isolating a polynucleotide encoding the hybrid polypeptide.
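
By way of illustration only, the sequence reorganization of step 2 can be modeled in silico as a single crossover within a region shared by two parent polynucleotides. The following is a toy model, not the cellular process itself: it requires an exactly identical shared region (whereas the method above requires only partial sequence homology), and the function names, the `min_len` parameter and the parent sequences are hypothetical.

```python
def longest_shared_region(a: str, b: str, min_len: int = 6):
    """Return (start_in_a, start_in_b, length) of the longest identical
    stretch shared by a and b, or None if it is shorter than min_len."""
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1  # extend the match ending at i, j
                if cur[j] > best[2]:
                    best = (i - cur[j], j - cur[j], cur[j])
        prev = cur
    return best if best[2] >= min_len else None


def crossover(first: str, second: str, min_len: int = 6):
    """Join the 5' part of `first` to the 3' part of `second` at the shared region."""
    region = longest_shared_region(first, second, min_len)
    if region is None:
        return None  # no shared region long enough to recombine
    start_a, start_b, length = region
    return first[:start_a + length] + second[start_b + length:]


# Toy parents sharing the 8-base region GGATCCGG.
parent_1 = "ATGAAA" + "GGATCCGG" + "TTTTAA"
parent_2 = "ATGCCC" + "GGATCCGG" + "CACATAG"
hybrid = crossover(parent_1, parent_2)
print(hybrid)
```

The hybrid carries the sequence of the first parent upstream of the shared region and that of the second parent downstream of it; whether the encoded polypeptide is active would still have to be established by the expression and screening of steps 3 and 4.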

Methods for screening for various enzyme activities are known to those of 
skill in the art and are discussed throughout the present specification. Such methods may be 
employed when isolating the polypeptides and polynucleotides of the invention. 

Screening Methodologies and "On-line" Monitoring Devices 

In practicing the methods of the invention, a variety of apparatus and
methodologies can be used in conjunction with the polypeptides and nucleic acids of the
invention, e.g., to screen polypeptides for xylanase activity (e.g., assays such as hydrolysis of
casein in zymograms, the release of fluorescence from gelatin, or the release of p-nitroanilide
from various small peptide substrates), to screen compounds as potential modulators, e.g.,
activators or inhibitors, of a xylanase activity, for antibodies that bind to a polypeptide of the
invention, for nucleic acids that hybridize to a nucleic acid of the invention, to screen for cells
expressing a polypeptide of the invention and the like. In addition to the array formats
described in detail below for screening samples, alternative formats can also be used to
practice the methods of the invention. Such formats include, for example, mass
spectrometers, chromatographs, e.g., high-throughput HPLC and other forms of liquid
chromatography, and smaller formats, such as 1536-well plates, 384-well plates and so on.
High throughput screening apparatus can be adapted and used to practice the methods of the
invention, see, e.g., U.S. Patent Application No. 20020001809.

Capillary Arrays 

Nucleic acids or polypeptides of the invention can be immobilized to or 
applied to an array. Arrays can be used to screen for or monitor libraries of compositions 
(e.g., small molecules, antibodies, nucleic acids, etc.) for their ability to bind to or modulate 
the activity of a nucleic acid or a polypeptide of the invention. Capillary arrays, such as the 
GIGAMATRIX™, Diversa Corporation, San Diego, CA; and arrays described in, e.g., U.S. 
Patent Application No. 20020080350 A1; WO 0231203 A; WO 0244336 A, provide an
alternative apparatus for holding and screening samples. In one aspect, the capillary array 
includes a plurality of capillaries formed into an array of adjacent capillaries, wherein each 
capillary comprises at least one wall defining a lumen for retaining a sample. The lumen may 
be cylindrical, square, hexagonal or any other geometric shape so long as the walls form a 
lumen for retention of a liquid or sample. The capillaries of the capillary array can be held 
together in close proximity to form a planar structure. The capillaries can be bound together, 
by being fused (e.g., where the capillaries are made of glass), glued, bonded, or clamped
side-by-side. Additionally, the capillary array can include interstitial material disposed between
adjacent capillaries in the array, thereby forming a solid planar device containing a plurality 
of through-holes. 

A capillary array can be formed of any number of individual capillaries, for 
example, a range from 100 to 4,000,000 capillaries. Further, a capillary array having about
100,000 or more individual capillaries can be formed into the standard size and shape of a 
Microtiter® plate for fitment into standard laboratory equipment. The lumens are filled 
manually or automatically using either capillary action or microinjection using a thin needle. 
Samples of interest may subsequently be removed from individual capillaries for further 
analysis or characterization. For example, a thin, needle-like probe is positioned in fluid 
communication with a selected capillary to either add or withdraw material from the lumen. 

In a single-pot screening assay, the assay components are mixed yielding a 
solution of interest, prior to insertion into the capillary array. The lumen is filled by capillary 
action when at least a portion of the array is immersed into a solution of interest. Chemical 
or biological reactions and/or activity in each capillary are monitored for detectable events. 
A detectable event is often referred to as a "hit", which can usually be distinguished from 
"non-hit" producing capillaries by optical detection. Thus, capillary arrays allow for 
massively parallel detection of "hits". 
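
By way of illustration only, distinguishing "hit" capillaries from "non-hit" capillaries by optical detection can be sketched as a threshold applied to one signal reading per capillary. The following is a minimal sketch under stated assumptions: the background is estimated as the median reading, the three-fold threshold is arbitrary, and the function name `find_hits` and the readings are hypothetical rather than measured data.

```python
import statistics


def find_hits(signals, threshold_ratio=3.0):
    """Return the indices of capillaries whose signal is at least
    threshold_ratio times the median signal of the whole array."""
    background = statistics.median(signals)  # assumes most capillaries are non-hits
    cutoff = threshold_ratio * background
    return [index for index, signal in enumerate(signals) if signal >= cutoff]


# Hypothetical fluorescence readings, one per capillary.
readings = [1.0, 1.2, 0.9, 15.0, 1.1, 1.0, 8.0, 0.95]
hits = find_hits(readings)
print("hit capillaries:", hits)
```

The returned indices identify the capillaries from which samples of interest would be recovered for further analysis or characterization.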

In a multi-pot screening assay, a polypeptide or nucleic acid, e.g., a ligand, can 
be introduced into a first component, which is introduced into at least a portion of a capillary 
of a capillary array. An air bubble can then be introduced into the capillary behind the first
component. A second component can then be introduced into the capillary, wherein the 
second component is separated from the first component by the air bubble. The first and 
second components can then be mixed by applying hydrostatic pressure to both sides of the 
capillary array to collapse the bubble. The capillary array is then monitored for a detectable 
event resulting from reaction or non-reaction of the two components. 

In a binding screening assay, a sample of interest can be introduced as a first
liquid labeled with a detectable particle into a capillary of a capillary array, wherein the 
lumen of the capillary is coated with a binding material for binding the detectable particle to 
the lumen. The first liquid may then be removed from the capillary tube, wherein the bound 
detectable particle is maintained within the capillary, and a second liquid may be introduced 
into the capillary tube. The capillary is then monitored for a detectable event resulting from 
reaction or non-reaction of the particle with the second liquid. 

Arrays, or "Biochips"

Nucleic acids or polypeptides of the invention can be immobilized to or 
applied to an array. Arrays can be used to screen for or monitor libraries of compositions 
(e.g., small molecules, antibodies, nucleic acids, etc.) for their ability to bind to or modulate 
the activity of a nucleic acid or a polypeptide of the invention. For example, in one aspect of 
the invention, a monitored parameter is transcript expression of a xylanase gene. One or 
more, or, all the transcripts of a cell can be measured by hybridization of a sample comprising 
transcripts of the cell, or, nucleic acids representative of or complementary to transcripts of a 
cell, by hybridization to immobilized nucleic acids on an array, or "biochip." By using an 
"array" of nucleic acids on a microchip, some or all of the transcripts of a cell can be 
simultaneously quantified. Alternatively, arrays comprising genomic nucleic acid can also be 
used to determine the genotype of a newly engineered strain made by the methods of the 
invention. "Polypeptide arrays" can also be used to simultaneously quantify a plurality of 
proteins. The present invention can be practiced with any known "array," also referred to as 
a "microarray" or "nucleic acid array" or "polypeptide array" or "antibody array" or 
"biochip," or variation thereof. Arrays are generically a plurality of "spots" or "target 
elements," each target element comprising a defined amount of one or more biological 
molecules, e.g., oligonucleotides, immobilized onto a defined area of a substrate surface for 
specific binding to a sample molecule, e.g., mRNA transcripts. 
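
As an illustration of how array signals for a monitored transcript might be compared between a parent strain and an engineered strain, the following is a minimal sketch only. The text does not prescribe a quantification formula; the log2 ratio of spot intensities, the function name log2_fold_changes, the floor parameter and the example intensities are all assumptions made for this illustration. 

```python
import math


def log2_fold_changes(reference, engineered, floor=1.0):
    """Return {spot: log2(engineered / reference)} for spots present in both samples.

    Intensities below `floor` are raised to `floor` so the ratio stays defined.
    The log2-ratio metric is an assumption of this sketch, not a method from the text.
    """
    changes = {}
    for spot, ref_signal in reference.items():
        if spot not in engineered:
            continue  # spot not measured in the engineered sample
        ref = max(ref_signal, floor)
        eng = max(engineered[spot], floor)
        changes[spot] = math.log2(eng / ref)
    return changes


# Hypothetical background-corrected spot intensities (arbitrary units).
parent = {"xylanase": 200.0, "housekeeping": 1000.0}
modified = {"xylanase": 800.0, "housekeeping": 1000.0}
changes = log2_fold_changes(parent, modified)
```

In this hypothetical data the xylanase spot is four times brighter in the engineered sample, while the housekeeping spot is unchanged. 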

In practicing the methods of the invention, any known array and/or method of 
making and using arrays can be incorporated in whole or in part, or variations thereof, as 
described, for example, in U.S. Patent Nos. 6,277,628; 6,277,489; 6,261,776; 6,258,606; 
6,054,270; 6,048,695; 6,045,996; 6,022,963; 6,013,440; 5,965,452; 5,959,098; 5,856,174; 
5,830,645; 5,770,456; 5,632,957; 5,556,752; 5,143,854; 5,807,522; 5,800,992; 5,744,305; 
5,700,637; 5,556,752; 5,434,049; see also, e.g., WO 99/51773; WO 99/09217; WO 97/46313; 
WO 96/17958; see also, e.g., Johnston (1998) Curr. Biol. 8:R171-R174; Schummer (1997) 
Biotechniques 23:1087-1092; Kern (1997) Biotechniques 23:120-124; Solinas-Toldo (1997) 
Genes, Chromosomes & Cancer 20:399-407; Bowtell (1999) Nature Genetics Supp. 21:25-32. 
See also published U.S. patent applications Nos. 20010018642; 20010019827; 
20010016322; 20010014449; 20010014448; 20010012537; 20010008765. 

Antibodies and Antibody-based screening methods 

The invention provides isolated or recombinant antibodies that specifically 
bind to a xylanase of the invention. These antibodies can be used to isolate, identify or 
quantify the xylanases of the invention or related polypeptides. These antibodies can be used 
to isolate other polypeptides within the scope of the invention or other related xylanases. The 
antibodies can be designed to bind to an active site of a xylanase. Thus, the invention 
provides methods of inhibiting xylanases using the antibodies of the invention (see discussion 
above regarding applications for anti-xylanase compositions of the invention). 

The invention provides fragments of the enzymes of the invention, including 
immunogenic fragments of a polypeptide of the invention. The invention provides 
compositions comprising a polypeptide or peptide of the invention and adjuvants or carriers 
and the like. 

The antibodies can be used in immunoprecipitation, staining, immunoaffinity 
columns, and the like. If desired, nucleic acid sequences encoding for specific antigens can 
be generated by immunization followed by isolation of polypeptide or nucleic acid, 
amplification or cloning and immobilization of polypeptide onto an array of the invention. 
Alternatively, the methods of the invention can be used to modify the structure of an antibody 
produced by a cell to be modified, e.g., an antibody's affinity can be increased or decreased. 
Furthermore, the ability to make or modify antibodies can be a phenotype engineered into a 
cell by the methods of the invention. 

Methods of immunization, producing and isolating antibodies (polyclonal and 
monoclonal) are known to those of skill in the art and described in the scientific and patent 
literature, see, e.g., Coligan, CURRENT PROTOCOLS IN IMMUNOLOGY, Wiley/Greene, 
NY (1991); Stites (eds.) BASIC AND CLINICAL IMMUNOLOGY (7th ed.) Lange Medical 
Publications, Los Altos, CA ("Stites"); Goding, MONOCLONAL ANTIBODIES: 
PRINCIPLES AND PRACTICE (2d ed.) Academic Press, New York, NY (1986); Kohler 
(1975) Nature 256:495; Harlow (1988) ANTIBODIES, A LABORATORY MANUAL, Cold 
Spring Harbor Publications, New York. Antibodies also can be generated in vitro, e.g., using 
recombinant antibody binding site expressing phage display libraries, in addition to the 
traditional in vivo methods using animals. See, e.g., Hoogenboom (1997) Trends Biotechnol. 
15:62-70; Katz (1997) Annu. Rev. Biophys. Biomol. Struct. 26:27-45. 

The polypeptides of Group B amino acid sequences and sequences 
substantially identical thereto or fragments comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 
75, 100, or 150 consecutive amino acids thereof, may also be used to generate antibodies 
which bind specifically to the polypeptides or fragments. The resulting antibodies may be 
used in immunoaffinity chromatography procedures to isolate or purify the polypeptide or to 
determine whether the polypeptide is present in a biological sample. In such procedures, a 
protein preparation, such as an extract, or a biological sample is contacted with an antibody 
capable of specifically binding to one of the polypeptides of Group B amino acid sequences 
and sequences substantially identical thereto, or fragments comprising at least 5, 10, 15, 20, 
25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof. 

In immunoaffinity procedures, the antibody is attached to a solid support, such 
as a bead or other column matrix. The protein preparation is placed in contact with the 
antibody under conditions in which the antibody specifically binds to one of the polypeptides 
of Group B amino acid sequences and sequences substantially identical thereto, or fragment 
thereof. After a wash to remove non-specifically bound proteins, the specifically bound 
polypeptides are eluted. 

The ability of proteins in a biological sample to bind to the antibody may be 
determined using any of a variety of procedures familiar to those skilled in the art. For 
example, binding may be determined by labeling the antibody with a detectable label such as 
a fluorescent agent, an enzymatic label, or a radioisotope. Alternatively, binding of the 
antibody to the sample may be detected using a secondary antibody having such a detectable 
label thereon. Particular assays include ELISA assays, sandwich assays, radioimmunoassays 
and Western Blots. 

Polyclonal antibodies generated against the polypeptides of Group B amino 
acid sequences and sequences substantially identical thereto, or fragments comprising at least 
5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof can be 
obtained by direct injection of the polypeptides into an animal or by administering the 
polypeptides to an animal, for example, a nonhuman. The antibody so obtained will then 
bind the polypeptide itself. In this manner, even a sequence encoding only a fragment of the 
polypeptide can be used to generate antibodies which may bind to the whole native 
polypeptide. Such antibodies can then be used to isolate the polypeptide from cells 
expressing that polypeptide. 

For preparation of monoclonal antibodies, any technique which provides 
antibodies produced by continuous cell line cultures can be used. Examples include the 
hybridoma technique (Kohler and Milstein, Nature, 256:495-497, 1975), the trioma 
technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today 4:72, 
1983) and the EBV-hybridoma technique (Cole, et al., 1985, in Monoclonal Antibodies and 
Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

Techniques described for the production of single chain antibodies (U.S. 
Patent No. 4,946,778) can be adapted to produce single chain antibodies to the polypeptides 
of Group B amino acid sequences and sequences substantially identical thereto, or fragments 
comprising at least 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids 
thereof. Alternatively, transgenic mice may be used to express humanized antibodies to these 
polypeptides or fragments thereof. 

Antibodies generated against the polypeptides of Group B amino acid 
sequences and sequences substantially identical thereto, or fragments comprising at least 5, 
10, 15, 20, 25, 30, 35, 40, 50, 75, 100, or 150 consecutive amino acids thereof may be used in 
screening for similar polypeptides from other organisms and samples. In such techniques, 
polypeptides from the organism are contacted with the antibody and those polypeptides which 
specifically bind the antibody are detected. Any of the procedures described above may be 
used to detect antibody binding. One such screening assay is described in "Methods for 
Measuring Cellulase Activities", Methods in Enzymology, Vol 160, pp. 87-116. 

Kits 

The invention provides kits comprising the compositions, e.g., nucleic acids, 
expression cassettes, vectors, cells, transgenic seeds or plants or plant parts, polypeptides 
(e.g., xylanases) and/or antibodies of the invention. The kits also can contain instructional 
material teaching the methodologies and industrial uses of the invention, as described herein. 

Whole cell engineering and measuring metabolic parameters 

The methods of the invention provide whole cell evolution, or whole cell 
engineering, of a cell to develop a new cell strain having a new phenotype, e.g., a new or 
modified xylanase activity, by modifying the genetic composition of the cell. The genetic 
composition can be modified by addition to the cell of a nucleic acid of the invention, e.g., a 
coding sequence for an enzyme of the invention. See, e.g., WO0229032; WO0196551. 

To detect the new phenotype, at least one metabolic parameter of a modified 
cell is monitored in the cell in a "real time" or "on-line" time frame. In one aspect, a plurality 
of cells, such as a cell culture, is monitored in "real time" or "on-line." In one aspect, a 
plurality of metabolic parameters is monitored in "real time" or "on-line." Metabolic 
parameters can be monitored using the xylanases of the invention. 

Metabolic flux analysis (MFA) is based on a known biochemistry framework. 
A linearly independent metabolic matrix is constructed based on the law of mass 
conservation and on the pseudo-steady state hypothesis (PSSH) on the intracellular 
metabolites. In practicing the methods of the invention, metabolic networks are established, 
including the: 

• identity of all pathway substrates, products and intermediary metabolites, 

• identity of all the chemical reactions interconverting the pathway metabolites, the 
stoichiometry of the pathway reactions, 

• identity of all the enzymes catalyzing the reactions, the enzyme reaction kinetics, 

• the regulatory interactions between pathway components, e.g. allosteric interactions, 
enzyme-enzyme interactions etc, 

• intracellular compartmentalization of enzymes or any other supramolecular 
organization of the enzymes, and, 

• the presence of any concentration gradients of metabolites, enzymes or effector 
molecules or diffusion barriers to their movement. 

Once the metabolic network for a given strain is built, a mathematical 
representation in matrix notation can be introduced to estimate the intracellular metabolic 
fluxes if the on-line metabolome data is available. Metabolic phenotype relies on the changes 
of the whole metabolic network within a cell. Metabolic phenotype relies on the change of 
pathway utilization with respect to environmental conditions, genetic regulation, 
developmental state and the genotype, etc. In one aspect of the methods of the invention, 
after the on-line MFA calculation, the dynamic behavior of the cells, their phenotype and 
other properties are analyzed by investigating the pathway utilization. For example, if the 
glucose supply is increased and the oxygen decreased during the yeast fermentation, the 
utilization of respiratory pathways will be reduced and/or stopped, and the utilization of the 
fermentative pathways will dominate. Control of the physiological state of cell cultures will 
become possible after the pathway analysis. The methods of the invention can help determine 
how to manipulate the fermentation by determining how to change the substrate supply, 
temperature, use of inducers, etc. to control the physiological state of cells to move in a 
desirable direction. In practicing the methods of the invention, the MFA results can also be 
compared with transcriptome and proteome data to design experiments and protocols for 
metabolic engineering or gene shuffling, etc. 
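
The matrix formulation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the invention: the toy three-metabolite network, the function names solve_linear and estimate_fluxes, and the measured flux values are all hypothetical. Under the pseudo-steady state hypothesis the stoichiometric matrix S and the flux vector v satisfy S·v = 0, so once some fluxes are measured the remaining ones follow from a linear solve. The sketch assumes the number of unknown fluxes equals the number of balanced metabolites and that the resulting system is non-singular. 

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a square system a*x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(a)
    m = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a non-zero entry in this column (raises StopIteration if singular).
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]


def estimate_fluxes(stoich, measured):
    """Estimate unmeasured fluxes from S*v = 0 (pseudo-steady state).

    stoich: rows are intracellular metabolites, columns are reactions.
    measured: {reaction index: measured flux}.
    """
    n_rxn = len(stoich[0])
    unknown = [j for j in range(n_rxn) if j not in measured]
    # Move the measured terms to the right-hand side: S_u * v_u = -S_m * v_m.
    a = [[row[j] for j in unknown] for row in stoich]
    b = [-sum(row[j] * v for j, v in measured.items()) for row in stoich]
    fluxes = dict(measured)
    fluxes.update(zip(unknown, solve_linear(a, b)))
    return [fluxes[j] for j in range(n_rxn)]


# Hypothetical network: v0 makes A; v1: A -> B; v2: A -> C; v3 drains B; v4 drains C.
S = [
    [1, -1, -1, 0, 0],   # balance on A
    [0, 1, 0, -1, 0],    # balance on B
    [0, 0, 1, 0, -1],    # balance on C
]
fluxes = estimate_fluxes(S, {0: 10, 3: 6})  # uptake v0 and secretion v3 are measured
```

Here the two measured fluxes fix the split of metabolite A between the two branches. Real networks are usually over- or under-determined, in which case a least-squares or constraint-based solver would be needed instead of the exact solve shown. 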

In practicing the methods of the invention, any modified or new phenotype 
can be conferred and detected, including new or improved characteristics in the cell. Any 
aspect of metabolism or growth can be monitored. 

Monitoring expression of an mRNA transcript 

In one aspect of the invention, the engineered phenotype comprises increasing 
or decreasing the expression of an mRNA transcript (e.g., a xylanase message) or generating 
new (e.g., xylanase) transcripts in a cell. This increased or decreased expression can be 
traced by testing for the presence of a xylanase of the invention or by xylanase activity 
assays. mRNA transcripts, or messages, also can be detected and quantified by any method 
known in the art, including, e.g., Northern blots, quantitative amplification reactions, 
hybridization to arrays, and the like. Quantitative amplification reactions include, e.g., 
quantitative PCR, including, e.g., quantitative reverse transcription polymerase chain 
reaction, or RT-PCR; quantitative real time RT-PCR, or "real-time kinetic RT-PCR" (see, 
e.g., Kreuzer (2001) Br. J. Haematol. 114:313-318; Xia (2001) Transplantation 72:907-914). 

In one aspect of the invention, the engineered phenotype is generated by 
knocking out expression of a homologous gene. The gene's coding sequence or one or more 
transcriptional control elements can be knocked out, e.g., promoters or enhancers. Thus, the 
expression of a transcript can be completely ablated or only decreased. 

In one aspect of the invention, the engineered phenotype comprises increasing 
the expression of a homologous gene. This can be effected by knocking out of a negative 
control element, including a transcriptional regulatory element acting in cis- or trans-, or, 
mutagenizing a positive control element. One or more, or, all the transcripts of a cell can be 
measured by hybridization of a sample comprising transcripts of the cell, or, nucleic acids 
representative of or complementary to transcripts of a cell, by hybridization to immobilized 
nucleic acids on an array. 

Monitoring expression of polypeptides, peptides and amino acids 

In one aspect of the invention, the engineered phenotype comprises increasing 
or decreasing the expression of a polypeptide (e.g., a xylanase) or generating new 
polypeptides in a cell. This increased or decreased expression can be traced by determining 
the amount of xylanase present or by xylanase activity assays. Polypeptides, peptides and 
amino acids also can be detected and quantified by any method known in the art, including, 
e.g., nuclear magnetic resonance (NMR), spectrophotometry, radiography (protein 
radiolabeling), electrophoresis, capillary electrophoresis, high performance liquid 
chromatography (HPLC), thin layer chromatography (TLC), hyperdiffusion chromatography, 
various immunological methods, e.g. immunoprecipitation, immunodiffusion, 
immuno-electrophoresis, radioimmunoassays (RIAs), enzyme-linked immunosorbent assays 
(ELISAs), immuno-fluorescent assays, gel electrophoresis (e.g., SDS-PAGE), staining with 
antibodies, fluorescent activated cell sorter (FACS), pyrolysis mass spectrometry, 
Fourier-Transform Infrared Spectrometry, Raman spectrometry, GC-MS, and LC-Electrospray 
and cap-LC-tandem-electrospray mass spectrometries, and the like. Novel bioactivities can also 
be screened using methods, or variations thereof, described in U.S. Patent No. 6,057,103. 
Furthermore, as discussed below in detail, one or more, or, all the polypeptides of a cell can 
be measured using a protein array. 

Industrial Applications 

The xylanase enzymes of the invention can be highly selective catalysts. They 
can catalyze reactions with exquisite stereo-, regio- and chemo- selectivities that are 
unparalleled in conventional synthetic chemistry. Moreover, enzymes are remarkably 
versatile. The xylanase enzymes of the invention can be tailored to function in organic 
solvents, operate at extreme pHs (for example, high pHs and low pHs), extreme temperatures 
(for example, high temperatures and low temperatures), extreme salinity levels (for example, 
high salinity and low salinity) and catalyze reactions with compounds that are structurally 
unrelated to their natural, physiological substrates. 

Detergent Compositions 

The invention provides detergent compositions comprising one or more 
polypeptides (e.g., xylanases) of the invention, and methods of making and using these 
compositions. The invention incorporates all methods of making and using detergent 
compositions, see, e.g., U.S. Patent Nos. 6,413,928; 6,399,561; 6,365,561; 6,380,147. The 
detergent compositions can be a one and two part aqueous composition, a non-aqueous liquid 
composition, a cast solid, a granular form, a particulate form, a compressed tablet, a gel 
and/or a paste and a slurry form. The xylanases of the invention can also be used as a 
detergent additive product in a solid or a liquid form. Such additive products are intended to 
supplement or boost the performance of conventional detergent compositions and can be 
added at any stage of the cleaning process. 

The actual active enzyme content depends upon the method of manufacture of 
a detergent composition and is not critical, assuming the detergent solution has the desired 
enzymatic activity. In one aspect, the amount of xylanase present in the final solution ranges 
from about 0.001 mg to 0.5 mg per gram of the detergent composition. The particular 
enzyme chosen for use in the process and products of this invention depends upon the 
conditions of final utility, including the physical product form, use pH, use temperature, and 
soil types to be degraded or altered. The enzyme can be chosen to provide optimum activity 
and stability for any given set of utility conditions. In one aspect, the xylanases of the present 
invention are active in the pH ranges of from about 4 to about 12 and in the temperature 
range of from about 20°C to about 95°C. The detergents of the invention can comprise 
cationic, semi-polar nonionic or zwitterionic surfactants; or, mixtures thereof. 
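
The dosing and use-condition ranges stated above can be turned into a simple arithmetic check. The following is a sketch only; the constant and function names are hypothetical, and the "about" in the stated ranges is treated here as a hard bound purely for illustration. 

```python
# Ranges quoted in the text for one aspect; the names are illustrative assumptions.
DOSE_MG_PER_G = (0.001, 0.5)   # mg xylanase per gram of detergent composition
PH_RANGE = (4.0, 12.0)
TEMP_C_RANGE = (20.0, 95.0)


def enzyme_mass_mg(detergent_g, dose_mg_per_g):
    """Return the xylanase mass (mg) for a detergent batch at the given dose."""
    low, high = DOSE_MG_PER_G
    if not low <= dose_mg_per_g <= high:
        raise ValueError(f"dose {dose_mg_per_g} mg/g is outside {low}-{high} mg/g")
    return detergent_g * dose_mg_per_g


def conditions_in_range(ph, temp_c):
    """True if the use pH and temperature fall inside the stated activity ranges."""
    return (PH_RANGE[0] <= ph <= PH_RANGE[1]
            and TEMP_C_RANGE[0] <= temp_c <= TEMP_C_RANGE[1])


# A 50 g portion of detergent dosed at the upper end of the stated range.
mass_mg = enzyme_mass_mg(50, 0.5)
wash_ok = conditions_in_range(ph=10.0, temp_c=60.0)
```

Whether a particular enzyme is actually suitable still depends on the product form and soil type, as the text notes; the check above covers only the numeric ranges. 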

Xylanases of the invention can be formulated into powdered and liquid 
detergents having pH between 4.0 and 12.0 at levels of about 0.01 to about 5% (preferably 
0.1% to 0.5%) by weight. These detergent compositions can also include other enzymes such 
as xylanases, cellulases, lipases or endoglycosidases, endo-beta-1,4-glucanases, 
beta-glucanases, endo-beta-1,3(4)-glucanases, cutinases, peroxidases, laccases, amylases, 
glucoamylases, pectinases, reductases, oxidases, phenoloxidases, ligninases, pullulanases, 
arabinanases, hemicellulases, mannanases, xyloglucanases, xylanases, pectin acetyl esterases, 
rhamnogalacturonan acetyl esterases, polygalacturonases, rhamnogalacturonases, 
galactanases, pectin lyases, pectin methylesterases, cellobiohydrolases and/or 
transglutaminases. These detergent compositions can also include builders and stabilizers. 

The addition of xylanases of the invention to conventional cleaning 
compositions does not create any special use limitation. In other words, any temperature and 
pH suitable for the detergent is also suitable for the compositions of the invention as long as 
the enzyme is active at or tolerant of the pH and/or temperature of the intended use. In 
addition, the xylanases of the invention can be used in a cleaning composition without 
detergents, again either alone or in combination with builders and stabilizers. 

The present invention provides cleaning compositions including detergent 
compositions for cleaning hard surfaces, detergent compositions for cleaning fabrics, 
dishwashing compositions, oral cleaning compositions, denture cleaning compositions, and 
contact lens cleaning solutions. 

In one aspect, the invention provides a method for washing an object 
comprising contacting the object with a polypeptide of the invention under conditions 
sufficient for washing. A xylanase of the invention may be included as a detergent additive. 
The detergent composition of the invention may, for example, be formulated as a hand or 
machine laundry detergent composition comprising a polypeptide of the invention. A 
laundry additive suitable for pre-treatment of stained fabrics can comprise a polypeptide of 
the invention. A fabric softener composition can comprise a xylanase of the invention. 
Alternatively, a xylanase of the invention can be formulated as a detergent composition for 
use in general household hard surface cleaning operations. In alternative aspects, detergent 
additives and detergent compositions of the invention may comprise one or more other 
enzymes such as a xylanase, a lipase, a cutinase, another xylanase, a carbohydrase, a 
cellulase, a pectinase, a mannanase, an arabinase, a galactanase, a xylanase, an oxidase, e.g., 
a laccase, and/or a peroxidase (see also, above). The properties of the enzyme(s) of the 
invention are chosen to be compatible with the selected detergent (i.e. pH-optimum, 
compatibility with other enzymatic and non-enzymatic ingredients, etc.) and the enzyme(s) is 
present in effective amounts. In one aspect, xylanase enzymes of the invention are used to 
remove malodorous materials from fabrics. Various detergent compositions and methods for 
making them that can be used in practicing the invention are described in, e.g., U.S. Patent 
Nos. 6,333,301; 6,329,333; 6,326,341; 6,297,038; 6,309,871; 6,204,232; 6,197,070; 
5,856,164. 

When formulated as compositions suitable for use in a laundry machine 
washing method, the xylanases of the invention can comprise both a surfactant and a builder 
compound. They can additionally comprise one or more detergent components, e.g., organic 
polymeric compounds, bleaching agents, additional enzymes, suds suppressors, dispersants, 
lime-soap dispersants, soil suspension and anti-redeposition agents and corrosion inhibitors. 
Laundry compositions of the invention can also contain softening agents, as additional 
detergent components. Such compositions containing carbohydrase can provide fabric 
cleaning, stain removal, whiteness maintenance, softening, color appearance, dye transfer 
inhibition and sanitization when formulated as laundry detergent compositions. 

The density of the laundry detergent compositions of the invention can range 
from about 200 to 1500 g/liter, or, about 400 to 1200 g/liter, or, about 500 to 950 g/liter, or, 
600 to 800 g/liter, of composition; this can be measured at about 20°C. 

The "compact" form of laundry detergent compositions of the invention is best 
reflected by density and, in terms of composition, by the amount of inorganic filler salt. 
Inorganic filler salts are conventional ingredients of detergent compositions in powder form. 
In conventional detergent compositions, the filler salts are present in substantial amounts, 
typically 17% to 35% by weight of the total composition. In one aspect of the compact 
compositions, the filler salt is present in amounts not exceeding 15% of the total composition, 
or, not exceeding 10%, or, not exceeding 5% by weight of the composition. The inorganic 
filler salts can be selected from the alkali and alkaline-earth-metal salts of sulphates and 
chlorides, e.g., sodium sulphate. 
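
The filler-salt thresholds above can be expressed as a small classification routine. This is an illustrative sketch only: the function name filler_salt_class and the label strings are hypothetical, and the text does not say how a composition between 15% and 17% filler salt should be described, so that gap is reported as unclassified. 

```python
def filler_salt_class(filler_wt_percent):
    """Classify a powder detergent by inorganic filler salt content (weight %).

    Thresholds follow the ranges quoted in the text; the labels are illustrative.
    """
    if not 0 <= filler_wt_percent <= 100:
        raise ValueError("weight percent must be between 0 and 100")
    if filler_wt_percent <= 5:
        return "compact (not exceeding 5%)"
    if filler_wt_percent <= 10:
        return "compact (not exceeding 10%)"
    if filler_wt_percent <= 15:
        return "compact (not exceeding 15%)"
    if 17 <= filler_wt_percent <= 35:
        return "conventional (17% to 35%)"
    return "unclassified"


# Hypothetical formulations, expressed as weight percent of sodium sulphate.
labels = [filler_salt_class(p) for p in (3, 12, 16, 25)]
```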

Liquid detergent compositions of the invention can also be in a "concentrated 
form." In one aspect, the liquid detergent compositions can contain a lower amount of water, 
compared to conventional liquid detergents. In alternative aspects, the water content of the 
concentrated liquid detergent is less than 40%, or, less than 30%, or, less than 20% by weight 
of the detergent composition. Detergent compounds of the invention can comprise 
formulations as described in WO 97/01629. 

Xylanases of the invention can be useful in formulating various cleaning 
compositions. A number of known compounds that are suitable surfactants, including nonionic, 
anionic, cationic, or zwitterionic detergents, can be used, e.g., as disclosed in U.S. Patent 
Nos. 4,404,128; 4,261,868; 5,204,015. In addition, xylanases can be used, for example, in 
bar or liquid soap applications, dish care formulations, contact lens cleaning solutions or 
products, peptide hydrolysis, waste treatment, textile applications, as fusion-cleavage 
enzymes in protein production, and the like. Xylanases may provide enhanced performance 
in a detergent composition as compared to another detergent xylanase, that is, the enzyme 
group may increase cleaning of certain enzyme sensitive stains such as grass or blood, as 
determined by usual evaluation after a standard wash cycle. Xylanases can be formulated 
into known powdered and liquid detergents having pH between 6.5 and 12.0 at levels of 
about 0.01 to about 5% (for example, about 0.1% to 0.5%) by weight. These detergent 
cleaning compositions can also include other enzymes such as known xylanases, xylanases, 
amylases, cellulases, lipases or endoglycosidases, as well as builders and stabilizers. 

In one aspect, the invention provides detergent compositions having xylanase 
activity (a xylanase of the invention) for use with fruit, vegetables and/or mud and clay 
compounds (see, for example, U.S. Pat. No. 5,786,316). 

Treating fibers and textiles 

The invention provides methods of treating fibers and fabrics using one or 
more xylanases of the invention. The xylanases can be used in any fiber- or fabric-treating 
method, which are well known in the art, see, e.g., U.S. Patent Nos. 6,261,828; 6,077,316; 
6,024,766; 6,021,536; 6,017,751; 5,980,581; US Patent Publication No. 20020142438 A1. 
For example, xylanases of the invention can be used in fiber and/or fabric desizing. In one 
aspect, the feel and appearance of a fabric is improved by a method comprising contacting the 
fabric with a xylanase of the invention in a solution. In one aspect, the fabric is treated with 
the solution under pressure. For example, xylanases of the invention can be used in the 
removal of stains. 

The xylanases of the invention can be used to treat any cellulosic material, 
including fibers (e.g., fibers from cotton, hemp, flax or linen), sewn and unsewn fabrics, e.g., 
knits, wovens, denims, yarns, and toweling, made from cotton, cotton blends or natural or 
manmade cellulosics (e.g. originating from xylan-containing cellulose fibers such as from 
wood pulp) or blends thereof. Examples of blends are blends of cotton or rayon/viscose with 
one or more companion material such as wool, synthetic fibers (e.g. polyamide fibers, acrylic 
fibers, polyester fibers, polyvinyl alcohol fibers, polyvinyl chloride fibers, polyvinylidene 
chloride fibers, polyurethane fibers, polyurea fibers, aramid fibers), and cellulose-containing 
fibers (e.g. rayon/viscose, ramie, hemp, flax/linen, jute, cellulose acetate fibers, lyocell). 

The textile treating processes of the invention (using xylanases of the 
invention) can be used in conjunction with other textile treatments, e.g., scouring and 
bleaching. Scouring is the removal of non-cellulosic material from the cotton fiber, e.g., the 
cuticle (mainly consisting of waxes) and primary cell wall (mainly consisting of pectin, 
protein and xyloglucan). A proper wax removal is necessary for obtaining a high wettability. 
This is needed for dyeing. Removal of the primary cell walls by the processes of the 
invention improves wax removal and ensures a more even dyeing. Treating textiles with the 
processes of the invention can improve whiteness in the bleaching process. The main 
chemical used in scouring is sodium hydroxide in high concentrations and at high 
temperatures. Bleaching comprises oxidizing the textile. Bleaching typically involves use of 
hydrogen peroxide as the oxidizing agent in order to obtain either a fully bleached (white) 
fabric or to ensure a clean shade of the dye. 

The invention also provides alkaline xylanases (xylanases active under 
alkaline conditions). These have wide-ranging applications in textile processing, degumming 
of plant fibers (e.g., plant bast fibers), treatment of pectic wastewaters, paper-making, and 
coffee and tea fermentations. See, e.g., Hoondal (2002) Applied Microbiology and 
Biotechnology 59:409-418. 

Treating foods and food processing 

The xylanases of the invention have numerous applications in the food processing 
industry. For example, in one aspect, the xylanases of the invention are used to improve the 
extraction of oil from oil-rich plant material, e.g., oil-rich seeds, for example, soybean oil 
from soybeans, olive oil from olives, rapeseed oil from rapeseed and/or sunflower oil from 
sunflower seeds. 

The xylanases of the invention can be used for separation of components of 
plant cell materials. For example, xylanases of the invention can be used in the separation of 
xylan-rich material (e.g., plant cells) into components. In one aspect, xylanases of the 
invention can be used to separate xylan-rich or oil-rich crops into valuable protein and oil and 
hull fractions. The separation process may be performed by use of methods known in the art. 

The xylanases of the invention can be used in the preparation of fruit or 
vegetable juices, syrups, extracts and the like to increase yield. The xylanases of the 
invention can be used in the enzymatic treatment (e.g., hydrolysis of xylan-comprising plant 
materials) of various plant cell wall-derived materials or waste materials, e.g. from cereals, 
grains, wine or juice production, or agricultural residues such as vegetable hulls, bean hulls, 
sugar beet pulp, olive pulp, potato pulp, and the like. The xylanases of the invention can be 
used to modify the consistency and appearance of processed fruit or vegetables. The 
xylanases of the invention can be used to treat plant material to facilitate processing of plant 
material, including foods, and to facilitate purification or extraction of plant components. The 
xylanases of the invention can be used to improve feed value, decrease the water binding 
capacity, improve the degradability in waste water plants and/or improve the conversion of 
plant material to ensilage, and the like. 

In one aspect, xylanases of the invention are used in baking applications, e.g., 
cookies and crackers, to hydrolyze arabinoxylans and create non-sticky doughs that are not 
difficult to machine and to reduce biscuit size. Use of xylanases of the invention to hydrolyze 
arabinoxylans can prevent rapid rehydration of the baked product, which results in loss of 
crispiness and reduced shelf-life. In one aspect, xylanases of the invention are used as 
additives in dough processing. In one aspect, xylanases of the invention are used in dough 
conditioning, wherein in one aspect the xylanases possess high activity over a temperature 
range of about 25-35°C and at near neutral pH (7.0 - 7.5). In one aspect, dough conditioning 
enzymes can be inactivated at the extreme temperatures of baking (>500°F). 

In one aspect, xylanases of the invention are used as additives in dough
processing to perform optimally under dough pH and temperature conditions. In one aspect,
an enzyme of the invention is used for dough conditioning. In one aspect, a xylanase of the
invention possesses high activity over a temperature range of 25-35°C and at near neutral pH
(7.0 - 7.5). In one aspect, the enzyme is inactivated at the extreme temperatures of baking,
for example, >500°F.

Paper or pulp treatment 

The xylanases of the invention can be used in paper or pulp treatment or paper
deinking. For example, in one aspect, the invention provides a paper treatment process using
a xylanase of the invention. In one aspect, the xylanase of the invention is applicable both in
reduction of the need for a chemical bleaching agent, such as chlorine dioxide, and in high
alkaline and high temperature environments. In one aspect, the xylanase of the invention is a
thermostable alkaline endoxylanase which can effect a greater than 25% reduction in the
chlorine dioxide requirement of kraft pulp with a less than 0.5% pulp yield loss. In one
aspect, boundary parameters are pH 10, 65-85°C and treatment time of less than 60 minutes
at an enzyme loading of less than 0.001 wt%. A pool of xylanases may be tested for the
ability to hydrolyze dye-labeled xylan at, for example, pH 10 and 60°C. The enzymes that
test positive under these conditions may then be evaluated at, for example, pH 10 and 70°C.
Alternatively, enzymes may be tested at pH 8 and pH 10 at 70°C. In the discovery of xylanases
desirable in the pulp and paper industry, libraries from high temperature or highly alkaline
environments were targeted. Specifically, these libraries were screened for enzymes
functioning at alkaline pH and a temperature of approximately 45°C. In another aspect, the
xylanases of the invention are useful in the pulp and paper industry in degradation of a lignin
hemicellulose linkage, in order to release the lignin.
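
The tiered screen described above (a first pass at pH 10 and 60°C, followed by re-testing of the positives at pH 10 and 70°C) can be sketched as a simple sequential filter. This is an illustration only: the candidate names, the activity readings and the threshold value are hypothetical and are not taken from the specification.

```python
# Hypothetical relative-activity readings on dye-labeled xylan,
# keyed by (pH, temperature in deg C). All values are illustrative.
candidates = {
    "xyl_A": {(10, 60): 0.80, (10, 70): 0.55},
    "xyl_B": {(10, 60): 0.40, (10, 70): 0.05},
    "xyl_C": {(10, 60): 0.02, (10, 70): 0.00},
}

def tiered_screen(readings, stages, threshold=0.10):
    """Keep only candidates whose activity exceeds `threshold` at every stage, in order."""
    survivors = list(readings)
    for condition in stages:
        survivors = [name for name in survivors
                     if readings[name].get(condition, 0.0) > threshold]
    return survivors

# First pass at pH 10 / 60 deg C, then re-test the positives at pH 10 / 70 deg C.
hits = tiered_screen(candidates, stages=[(10, 60), (10, 70)])
print(hits)
```

Running the stages in order means that only enzymes positive under the milder condition are carried forward to the more demanding one, which mirrors the evaluation order given in the text.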

Animal feeds and food or feed additives 

The invention provides methods for treating animal feeds and foods and food
or feed additives using xylanases of the invention, the animals including mammals (e.g.,
humans), birds, fish and the like. The invention provides animal feeds, foods, and additives
comprising xylanases of the invention. In one aspect, treating animal feeds, foods and
additives using xylanases of the invention can help increase the availability of nutrients, e.g.,
starch, protein, and the like, in the animal feed or additive. By breaking down difficult-to-digest
proteins or indirectly or directly unmasking starch (or other nutrients), the xylanase makes
nutrients more accessible to other endogenous or exogenous enzymes. The xylanase can also
simply cause the release of readily digestible and easily absorbed nutrients and sugars.

When added to animal feed, xylanases of the invention improve the in vivo
break-down of plant cell wall material, partly due to a reduction of the intestinal viscosity
(see, e.g., Bedford et al., Proceedings of the 1st Symposium on Enzymes in Animal Nutrition,
1993, pp. 73-77), whereby a better utilization of the plant nutrients by the animal is achieved.
Thus, by using xylanases of the invention in feeds, the growth rate and/or feed conversion
ratio (i.e., the weight of ingested feed relative to weight gain) of the animal is improved.
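
The feed conversion ratio defined above (weight of ingested feed relative to weight gain) is a simple quotient, and a short sketch makes the direction of improvement explicit. The trial figures below are hypothetical and serve only to illustrate the arithmetic.

```python
def feed_conversion_ratio(feed_ingested_kg, weight_gain_kg):
    """Feed conversion ratio: weight of ingested feed divided by weight gain."""
    if weight_gain_kg <= 0:
        raise ValueError("weight gain must be positive")
    return feed_ingested_kg / weight_gain_kg

# Hypothetical trial figures, for illustration only.
control = feed_conversion_ratio(feed_ingested_kg=4.0, weight_gain_kg=2.0)
treated = feed_conversion_ratio(feed_ingested_kg=3.6, weight_gain_kg=2.0)
print(control, treated)
```

Under this definition a lower ratio means less feed is consumed per unit of weight gained, so an improved feed conversion ratio is a smaller number.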

The animal feed additive of the invention may be a granulated enzyme product
which may readily be mixed with feed components. Alternatively, feed additives of the
invention can form a component of a pre-mix. The granulated enzyme product of the
invention may be coated or uncoated. The particle size of the enzyme granulates can be
compatible with that of feed and pre-mix components. This provides a safe and convenient
means of incorporating enzymes into feeds. Alternatively, the animal feed additive of the
invention may be a stabilized liquid composition. This may be an aqueous or oil-based
slurry. See, e.g., U.S. Patent No. 6,245,546.

Xylanases of the present invention, in the modification of animal feed or a
food, can process the food or feed either in vitro (by modifying components of the feed or
food) or in vivo. Xylanases can be added to animal feed or food compositions containing
high amounts of xylans, e.g., feed or food containing plant material from cereals, grains and
the like. When added to the feed or food, the xylanase significantly improves the in vivo
break-down of xylan-containing material, e.g., plant cell walls, whereby a better utilization of
the plant nutrients by the animal (e.g., human) is achieved. In one aspect, the growth rate
and/or feed conversion ratio (i.e., the weight of ingested feed relative to weight gain) of the
animal is improved. For example, a partially digestible or indigestible xylan-comprising
protein is fully or partially degraded by a xylanase of the invention, e.g., in combination with
another enzyme, e.g., beta-galactosidase, to peptides and galactose and/or galactooligomers.
These enzyme digestion products are more digestible by the animal. Thus, xylanases of the
invention can contribute to the available energy of the feed or food. Also, by contributing to
the degradation of xylan-comprising proteins, a xylanase of the invention can improve the
digestibility and uptake of carbohydrate and non-carbohydrate feed or food constituents such
as protein, fat and minerals.

In another aspect, xylanases of the invention can be supplied by expressing the
enzymes directly in transgenic feed crops (as, e.g., transgenic plants, seeds and the like), such
as grains, cereals, corn, soy bean, rape seed, lupin and the like. As discussed above, the
invention provides transgenic plants, plant parts and plant cells comprising a nucleic acid
sequence encoding a polypeptide of the invention. In one aspect, the nucleic acid is
expressed such that the xylanase of the invention is produced in recoverable quantities. The
xylanase can be recovered from any plant or plant part. Alternatively, the plant or plant part
containing the recombinant polypeptide can be used as such for improving the quality of a
food or feed, e.g., improving nutritional value, palatability, and rheological properties, or to
destroy an antinutritive factor.

In one aspect, the invention provides methods for removing oligosaccharides
from feed prior to consumption by an animal subject using a xylanase of the invention. In
this process a feed is formed having an increased metabolizable energy value. In addition to
xylanases of the invention, galactosidases, cellulases and combinations thereof can be used.
In one aspect, the enzyme is added in an amount equal to between about 0.1% and 1% by
weight of the feed material. In one aspect, the feed is a cereal, a wheat, a grain or a soybean
(e.g., a ground soybean) material. See, e.g., U.S. Patent No. 6,399,123.

In another aspect, the invention provides methods for utilizing xylanase as a
nutritional supplement in the diets of animals by preparing a nutritional supplement
containing a recombinant xylanase enzyme comprising at least thirty contiguous amino acids
of an amino acid sequence of Group B amino acid sequences, and administering the nutritional
supplement to an animal to increase the utilization of xylan contained in food ingested by the
animal.

In yet another aspect, the invention provides an edible pelletized enzyme
delivery matrix and method of use for delivery of xylanase to an animal, for example as a
nutritional supplement. The enzyme delivery matrix readily releases a xylanase enzyme,
such as one having an amino acid sequence of Group B amino acid sequences, or at least 30
contiguous amino acids thereof, in aqueous media, such as, for example, the digestive fluid of
an animal. The invention enzyme delivery matrix is prepared from a granulate edible carrier
selected from such components as grain germ that is spent of oil, hay, alfalfa, timothy, soy
hull, sunflower seed meal, wheat midd, and the like, that readily disperse the recombinant
enzyme contained therein into aqueous media. In use, the edible pelletized enzyme delivery
matrix is administered to an animal to deliver xylanase to the animal. Suitable grain-based
substrates may comprise or be derived from any suitable edible grain, such as wheat,
corn, soy, sorghum, alfalfa, barley, and the like. An exemplary grain-based substrate is a
corn-based substrate. The substrate may be derived from any suitable part of the grain, but is
preferably a grain germ approved for animal feed use, such as corn germ that is obtained in a
wet or dry milling process. The grain germ preferably comprises spent germ, which is grain
germ from which oil has been expelled, such as by pressing or hexane or other solvent
extraction. Alternatively, the grain germ is expeller extracted, that is, the oil has been
removed by pressing.

The enzyme delivery matrix of the invention is in the form of discrete plural
particles, pellets or granules. By "granules" is meant particles that are compressed or
compacted, such as by pelletizing, extrusion, or similar compacting to remove water from
the matrix. Such compression or compacting of the particles also promotes intraparticle
cohesion of the particles. For example, the granules can be prepared by pelletizing the
grain-based substrate in a pellet mill. The pellets prepared thereby are ground or crumbled to a
granule size suitable for use as an adjuvant in animal feed. Since the matrix is itself approved
for use in animal feed, it can be used as a diluent for delivery of enzymes in animal feed.

Preferably, the enzyme delivery matrix is in the form of granules having a
granule size ranging from about 4 to about 400 mesh (USS); more preferably, about 8 to
about 80 mesh; and most preferably about 14 to about 20 mesh. If the grain germ is spent via
solvent extraction, use of a lubricity agent such as corn oil may be necessary in the pelletizer,
but such a lubricity agent ordinarily is not necessary if the germ is expeller extracted. In
other aspects of the invention, the matrix is prepared by other compacting or compressing
processes such as, for example, by extrusion of the grain-based substrate through a die and
grinding of the extrudate to a suitable granule size.

The enzyme delivery matrix may further include a polysaccharide component
as a cohesiveness agent to enhance the cohesiveness of the matrix granules. The
cohesiveness agent is believed to provide additional hydroxyl groups, which enhance the
bonding between grain proteins within the matrix granule. It is further believed that the
additional hydroxyl groups so function by enhancing the hydrogen bonding of proteins to
starch and to other proteins. The cohesiveness agent may be present in any amount suitable
to enhance the cohesiveness of the granules of the enzyme delivery matrix. Suitable
cohesiveness agents include one or more of dextrins, maltodextrins, starches, such as corn
starch, flours, cellulosics, hemicellulosics, and the like. For example, the percentage of grain
germ and cohesiveness agent in the matrix (not including the enzyme) is 78% corn germ meal
and 20% by weight of corn starch.

Because the enzyme-releasing matrix of the invention is made from
biodegradable materials, the matrix may be subject to spoilage, such as by molding. To
prevent or inhibit such molding, the matrix may include a mold inhibitor, such as a
propionate salt, which may be present in any amount sufficient to inhibit the molding of the
enzyme-releasing matrix, thus providing a delivery matrix in a stable formulation that does
not require refrigeration.

The xylanase enzyme contained in the invention enzyme delivery matrix and
methods is preferably a thermostable xylanase, as described herein, so as to resist inactivation
of the xylanase during manufacture where elevated temperatures and/or steam may be
employed to prepare the pelletized enzyme delivery matrix. During digestion of feed
containing the invention enzyme delivery matrix, aqueous digestive fluids will cause release
of the active enzyme. Other types of thermostable enzymes and nutritional supplements that
are thermostable can also be incorporated in the delivery matrix for release under any type of
aqueous conditions.

A coating can be applied to the invention enzyme matrix particles for many
different purposes, such as to add a flavor or nutrition supplement to animal feed, to delay
release of animal feed supplements and enzymes in gastric conditions, and the like. Or, the
coating may be applied to achieve a functional goal, for example, whenever it is desirable to
slow release of the enzyme from the matrix particles or to control the conditions under which
the enzyme will be released. The composition of the coating material can be such that it is
selectively broken down by an agent to which it is susceptible (such as heat, acid or base,
enzymes or other chemicals). Alternatively, two or more coatings susceptible to different
such breakdown agents may be consecutively applied to the matrix particles.

The invention is also directed towards a process for preparing an
enzyme-releasing matrix. In accordance with the invention, the process comprises providing discrete
plural particles of a grain-based substrate in a particle size suitable for use as an
enzyme-releasing matrix, wherein the particles comprise a xylanase enzyme having an amino
acid sequence of Group B amino acid sequences or at least 30 consecutive amino acids
thereof. Preferably, the process includes compacting or compressing the particles of
enzyme-releasing matrix into granules, which most preferably is accomplished by pelletizing. The
mold inhibitor and cohesiveness agent, when used, can be added at any suitable time, and
preferably are mixed with the grain-based substrate in the desired proportions prior to
pelletizing of the grain-based substrate. Moisture content in the pellet mill feed preferably is
in the ranges set forth above with respect to the moisture content in the finished product, and
preferably is about 14-15%. Preferably, moisture is added to the feedstock in the form of an
aqueous preparation of the enzyme to bring the feedstock to this moisture content. The
temperature in the pellet mill preferably is brought to about 82°C with steam. The pellet mill
may be operated under any conditions that impart sufficient work to the feedstock to provide
pellets. The pelleting process itself is a cost-effective process for removing water from the
enzyme-containing composition.
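
The amount of aqueous enzyme preparation needed to bring a feedstock to the target moisture content follows from a simple mass balance. The sketch below assumes that moisture is expressed on a wet-weight basis and that the added preparation can be treated as water; both are simplifying assumptions, and the batch size and starting moisture are hypothetical.

```python
def water_to_add(feed_mass_kg, initial_moisture, target_moisture):
    """Mass of water (kg) that raises a feedstock from `initial_moisture`
    to `target_moisture`, both given as wet-basis mass fractions.

    Derived from (m0 * M + w) / (M + w) = m1, solved for w.
    """
    if not 0 <= initial_moisture <= target_moisture < 1:
        raise ValueError("need 0 <= initial <= target < 1")
    return feed_mass_kg * (target_moisture - initial_moisture) / (1 - target_moisture)

# Hypothetical batch: 100 kg of feedstock at 10% moisture, target 15%.
added = water_to_add(100.0, 0.10, 0.15)
print(round(added, 2))
```

The added mass is slightly larger than the naive 5 kg because the water also increases the total mass of the batch against which the moisture fraction is measured.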

In one aspect, the pellet mill is operated with a 1/8 in. by 2 in. die at 100
lb./min. pressure at 82°C to provide pellets, which then are crumbled in a pellet mill
crumbler to provide discrete plural particles having a particle size capable of passing through
an 8 mesh screen but being retained on a 20 mesh screen.
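
The size fraction described above (passing an 8 mesh screen but retained on a 20 mesh screen) can be expressed as a check against sieve openings. The sketch assumes nominal U.S. standard sieve openings of about 2.36 mm for No. 8, 1.40 mm for No. 14 and 0.85 mm for No. 20; these values are not given in the specification, and treating a granule as a single diameter is a simplification.

```python
# Nominal U.S. standard sieve openings in mm (assumed values, not from the text).
SIEVE_OPENING_MM = {8: 2.36, 14: 1.40, 20: 0.85}

def in_size_fraction(particle_mm, pass_mesh=8, retain_mesh=20):
    """True if a particle passes the coarser screen and is retained on the finer one."""
    passes_coarse = particle_mm < SIEVE_OPENING_MM[pass_mesh]
    retained_on_fine = particle_mm > SIEVE_OPENING_MM[retain_mesh]
    return passes_coarse and retained_on_fine

# Hypothetical granule diameters in mm.
for diameter in (3.0, 1.5, 0.5):
    print(diameter, in_size_fraction(diameter))
```

A higher mesh number corresponds to a smaller opening, so the 8 mesh screen sets the upper size limit and the 20 mesh screen the lower one.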

The thermostable xylanases of the invention can be used in the pellets of the
invention. They can have high optimum temperatures and high heat resistance such that an
enzyme reaction at a temperature not hitherto carried out can be achieved. The gene
encoding the xylanase according to the present invention (e.g., as set forth in any of the
sequences in Group A nucleic acid sequences) can be used in preparation of xylanases (e.g.,
using GSSM™ as described herein) having characteristics different from those of the
xylanases of Group B amino acid sequences (in terms of optimum pH, optimum temperature,
heat resistance, stability to solvents, specific activity, affinity to substrate, secretion ability,
translation rate, transcription control and the like). Furthermore, a polynucleotide of Group A
nucleic acid sequences may be employed for screening of variant xylanases prepared by the
methods described herein to determine those having a desired activity, such as improved or
modified thermostability or thermotolerance. For example, U.S. Patent No. 5,830,732
describes a screening assay for determining thermotolerance of a xylanase.

Waste treatment

The xylanases of the invention can be used in a variety of other industrial
applications, e.g., in waste treatment. For example, in one aspect, the invention provides a
solid waste digestion process using xylanases of the invention. The methods can comprise
reducing the mass and volume of substantially untreated solid waste. Solid waste can be
treated with an enzymatic digestive process in the presence of an enzymatic solution
(including xylanases of the invention) at a controlled temperature. This results in a reaction
without appreciable bacterial fermentation from added microorganisms. The solid waste is
converted into a liquefied waste and any residual solid waste. The resulting liquefied waste
can be separated from any residual solid waste. See, e.g., U.S. Patent No. 5,709,796.

Oral care products

The invention provides oral care products comprising xylanases of the
invention. Exemplary oral care products include toothpastes, dental creams, gels or tooth
powders, odontics, mouth washes, pre- or post-brushing rinse formulations, chewing gums,
lozenges, or candy. See, e.g., U.S. Patent No. 6,264,925.

Brewing and fermenting

The invention provides methods of brewing (e.g., fermenting) beer comprising
xylanases of the invention. In one exemplary process, starch-containing raw materials are
disintegrated and processed to form a malt. A xylanase of the invention is used at any point
in the fermentation process. For example, xylanases of the invention can be used in the
processing of barley malt. The major raw material of beer brewing is barley malt. This can
be a three stage process. First, the barley grain can be steeped to increase water content, e.g.,
to about 40%. Second, the grain can be germinated by incubation at 15 to 25°C for 3
to 6 days when enzyme synthesis is stimulated under the control of gibberellins. In one
aspect, xylanases of the invention are added at this (or any other) stage of the process.
Xylanases of the invention can be used in any beer or alcoholic beverage producing process,
as described, e.g., in U.S. Patent Nos. 5,762,991; 5,536,650; 5,405,624; 5,021,246; 4,788,066.

In one aspect, an enzyme of the invention is used to improve filterability and
wort viscosity and to obtain a more complete hydrolysis of endosperm components. Use of
an enzyme of the invention would also increase extract yield. The process of brewing
involves germination of the barley grain (malting) followed by the extraction and the
breakdown of the stored carbohydrates to yield simple sugars that are used by yeast for
alcoholic fermentation. Efficient breakdown of the carbohydrate reserves present in the
barley endosperm and brewing adjuncts requires the activity of several different enzymes.

In one aspect, an enzyme of the invention has activity in slightly acidic pH
(e.g., 5.5-6.0) in, e.g., the 40°C to 70°C temperature range; and, in one aspect, with
inactivation at 95°C. Activity under such conditions would be optimal, but is not an
essential requirement for efficacy. In one aspect, an enzyme of the invention has activity
between 40-75°C and pH 5.5-6.0, is stable at 70°C for at least 50 minutes, and, in one aspect, is
inactivated at 96-100°C. Enzymes of the invention can be used with other enzymes, e.g.,
beta-1,4-endoglucanases and amylases.

Medical and research applications 

Xylanases of the invention can be used as antimicrobial agents due to their
bacteriolytic properties. Xylanases of the invention can be used for eliminating, or protecting
animals from, salmonellae, as described in, e.g., PCT Application Nos. WO0049890 and
WO9903497.

Other industrial applications 

Xylanases of the invention, including Group B amino acid
sequences, can be used in a wide variety of food, animal feed and beverage applications. New
xylanases are discovered by screening existing libraries and DNA libraries constructed from
diverse mesophilic and moderately thermophilic locations as well as from targeted sources
including digestive flora, microorganisms in animal waste, soil bacteria and highly alkaline
habitats. Biotrap and primary enrichment strategies using arabinoxylan substrates and/or
non-soluble polysaccharide fractions of animal feed material are also useful.

Two screening formats (activity-based and sequence-based) are used in the
discovery of novel xylanases. The activity-based approach is direct screening for xylanase
activity in agar plates using a substrate such as AZO-xylan (Megazyme). Alternatively, a
sequence-based approach may be used, which relies on bioinformatics and molecular biology
to design probes for hybridization and biopanning. See, for example, U.S. Patent Nos.
6,054,267, 6,030,779, 6,368,798, 6,344,328. Hits from the screening are purified, sequenced,
characterized (for example, determination of specificity, temperature and pH optima),
analyzed using bioinformatics, subcloned and expressed for basic biochemical
characterization. These methods may be used in screening for xylanases useful in a myriad
of applications, including dough conditioning and as animal feed additive enzymes.

In characterizing enzymes obtained from screening, the exemplary utility in
dough processing and baking applications may be assessed. Characterization may include,
for example, measurement of substrate specificity (xylan, arabinoxylan, CMC, BBG),
temperature and pH stability and specific activity. A commercial enzyme may be used as a
benchmark. In one aspect, the enzymes of the invention have significant activity at pH > 7
and 25-35°C, are inactive on insoluble xylan, and are stable and active in 50-67% sucrose.

In another aspect, utility as feed additives may be assessed from
characterization of candidate enzymes. Characterization may include, for example,
measurement of substrate specificity (xylan, arabinoxylan, CMC, BBG), temperature and pH
stability, specific activity and gastric stability. In one aspect the feed is designed for a
monogastric animal and in another aspect the feed is designed for a ruminant animal. In one
aspect, the enzymes of the invention have significant activity at pH 2-4 and 35-40°C, a
half-life greater than 30 minutes in gastric fluid, and a formulation (in buffer or cells) half-life
greater than 5 minutes at 85°C, and are used as a monogastric animal feed additive. In another
aspect, the enzymes of the invention have one or more of the following characteristics:
significant activity at pH 6.5-7.0 and 35-40°C, a half-life greater than 30 minutes in rumen
fluid, and formulation stability as a stable dry powder, and are used as a ruminant animal feed
additive.
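
The feed additive criteria above can be read as a checklist applied to each characterized candidate. The following sketch is illustrative only: the record fields and the example values are hypothetical, and the ruminant check requires all listed characteristics, which is stricter than the "one or more" wording of the text.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    active_ph_2_4: bool            # significant activity at pH 2-4, 35-40 deg C
    active_ph_6_5_7: bool          # significant activity at pH 6.5-7.0, 35-40 deg C
    gastric_half_life_min: float   # half-life in gastric fluid
    rumen_half_life_min: float     # half-life in rumen fluid
    half_life_85c_min: float       # formulation half-life at 85 deg C
    stable_dry_powder: bool

def suggested_uses(c):
    """Return the feed additive uses whose criteria the candidate meets."""
    uses = []
    if c.active_ph_2_4 and c.gastric_half_life_min > 30 and c.half_life_85c_min > 5:
        uses.append("monogastric")
    if c.active_ph_6_5_7 and c.rumen_half_life_min > 30 and c.stable_dry_powder:
        uses.append("ruminant")
    return uses

# Hypothetical characterization results.
example = Candidate("xyl_X", True, False, 45.0, 0.0, 6.0, False)
print(suggested_uses(example))
```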

Enzymes are reactive toward a wide range of natural and unnatural substrates,
thus enabling the modification of virtually any organic lead compound. Moreover, unlike
traditional chemical catalysts, enzymes are highly enantio- and regio-selective. The high
degree of functional group specificity exhibited by enzymes enables one to keep track of each
reaction in a synthetic sequence leading to a new active compound. Enzymes are also capable
of catalyzing many diverse reactions unrelated to their physiological function in nature. For
example, peroxidases catalyze the oxidation of phenols by hydrogen peroxide. Peroxidases
can also catalyze hydroxylation reactions that are not related to the native function of the
enzyme. Other examples are proteases, which catalyze the breakdown of polypeptides. In
organic solution some proteases can also acylate sugars, a function unrelated to the native
function of these enzymes.

The present invention exploits the unique catalytic properties of enzymes.
Whereas the use of biocatalysts (i.e., purified or crude enzymes, non-living or living cells) in
chemical transformations normally requires the identification of a particular biocatalyst that
reacts with a specific starting compound, the present invention uses selected biocatalysts and
reaction conditions that are specific for functional groups that are present in many starting
compounds. Each biocatalyst is specific for one functional group, or several related
functional groups, and can react with many starting compounds containing this functional
group. The biocatalytic reactions produce a population of derivatives from a single starting
compound. These derivatives can be subjected to another round of biocatalytic reactions to
produce a second population of derivative compounds. Thousands of variations of the
original compound can be produced with each iteration of biocatalytic derivatization.

Enzymes react at specific sites of a starting compound without affecting the
rest of the molecule, a process which is very difficult to achieve using traditional chemical
methods. This high degree of biocatalytic specificity provides the means to identify a single
active compound within the library. The library is characterized by the series of biocatalytic
reactions used to produce it, a so-called "biosynthetic history". Screening the library for
biological activities and tracing the biosynthetic history identifies the specific reaction
sequence producing the active compound. The reaction sequence is repeated and the structure
of the synthesized compound determined. This mode of identification, unlike other synthesis
and screening approaches, does not require immobilization technologies, and compounds can
be synthesized and tested free in solution using virtually any type of screening assay. It is
important to note that the high degree of specificity of enzyme reactions on functional
groups allows for the "tracking" of specific enzymatic reactions that make up the
biocatalytically produced library.
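
The idea of a library characterized by its "biosynthetic history" can be sketched by carrying the ordered list of reactions along with each derivative. The reaction names and the string-based compound labels below are hypothetical placeholders, and the sketch assumes for simplicity that every reaction applies to every compound.

```python
# Hypothetical biocatalytic steps; each maps a compound label to a derivative label.
REACTIONS = {
    "acylation": lambda c: c + "-Ac",
    "oxidation": lambda c: c + "-Ox",
    "glycosylation": lambda c: c + "-Glc",
}

def derivatize(population):
    """Apply every reaction to every (compound, history) pair, extending the history."""
    return [(fn(compound), history + (name,))
            for compound, history in population
            for name, fn in REACTIONS.items()]

library = [("lead", ())]          # single starting compound with an empty history
for _ in range(2):                # two rounds of biocatalytic derivatization
    library = derivatize(library)

history_of = dict(library)
print(len(library))
print(history_of["lead-Ac-Ox"])
```

If a screen flags a particular derivative as active, looking it up in `history_of` recovers the reaction sequence to repeat, which is the tracing step the text describes. The library size grows multiplicatively with each round (here three reactions over two rounds).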

Many of the procedural steps are performed using robotic automation, enabling
the execution of many thousands of biocatalytic reactions and screening assays per day as
well as ensuring a high level of accuracy and reproducibility. As a result, a library of
derivative compounds can be produced in a matter of weeks which would take years to
produce using current chemical methods. (For further teachings on modification of
molecules, including small molecules, see PCT/US94/09174).

The invention will be further described with reference to the following
examples; however, it is to be understood that the invention is not limited to such examples.



EXAMPLES 

EXAMPLE 1: PLATE BASED ENDOGLYCOSIDASE ENZYME DISCOVERY:
EXPRESSION SCREENING

Titer determination of Lambda Library: Add 1.0 µL of Lambda Zap Express amplified
library stock to 600 µL E. coli MRF' cells (OD600=1.0). Dilute MRF' stock with 10 mM
MgSO4. Incubate mixture at 37°C for 15 minutes, then transfer suspension to 5-6 mL of NZY
top agar at 50°C and gently mix. Immediately pour agar solution onto large (150 mm) NZY
media plate and allow top agar to solidify completely (approximately 30 minutes). Invert the
plate. Incubate the plate at 39°C for 8-12 hours. (The number of plaques is approximated.
Phage titer is determined to give 50,000 pfu/plate. Dilute an aliquot of Library phage with SM
buffer if needed.)
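
The titer step above amounts to converting a plaque count into a stock concentration and then working out how much stock delivers 50,000 pfu per plate. The sketch below shows that arithmetic; the plaque count and the dilution factor are hypothetical values chosen for illustration.

```python
def titer_pfu_per_ul(plaque_count, volume_plated_ul, dilution_factor=1.0):
    """Titer of the undiluted stock in pfu/uL.

    `dilution_factor` is the fold dilution of the plated sample,
    e.g. 1000 for a 1:1000 dilution in SM buffer.
    """
    return plaque_count * dilution_factor / volume_plated_ul

def volume_for_target_ul(titer, target_pfu=50_000):
    """Volume of undiluted stock (uL) that contains `target_pfu` plaque-forming units."""
    return target_pfu / titer

# Hypothetical count: 200 plaques from 1.0 uL of a 1:1000 dilution.
titer = titer_pfu_per_ul(plaque_count=200, volume_plated_ul=1.0, dilution_factor=1000)
needed = volume_for_target_ul(titer)
print(titer, needed)
```

When the required volume of undiluted stock is too small to pipette reliably, as in this example, diluting an aliquot of the library phage with SM buffer before plating is the practical route, consistent with the note in the protocol.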

Substrate screening: Add Lambda Zap Express (50,000 pfu) from amplified library to 600 µL
of E. coli MRF' cells (OD600=1.0) and incubate at 37°C for 15 minutes. While phage/cell
suspension is incubating, add 1.0 mL of desired polysaccharide dye-labeled substrate (usually
1-2% w/v) to 5.0 mL NZY top agar at 50°C and mix thoroughly. (Solution kept at 50°C until
needed.) Transfer the cell suspension to substrate/top agar solution and gently mix.
Immediately pour solution onto large (150 mm) NZY media plate. Allow top agar to solidify
completely (approximately 30 minutes), then invert plate. Incubate plate at 39°C for 8-12
hours. Observe plate for clearing zones (halos) around plaques. Core plaques with halos out
of agar and transfer to a sterile micro tube. (A large bore 200 µL pipette tip works well to
remove (core) the agar plug containing the desired plaque.) Resuspend phage in 500 µL SM
buffer. Add 20 µL chloroform to inhibit any further cell growth.

Isolation of pure clones: Add 5 µL of resuspended phage suspension to 500 µL of E. coli
MRF' cells (OD600=1.0). Incubate at 37°C for 15 minutes. While phage/cell suspension is
incubating, add 600 µL of desired polysaccharide dye-labeled substrate (usually 1-2% w/v) to
3.0 mL NZY top agar at 50°C and mix thoroughly. (Solution kept at 50°C until needed.)
Transfer cell suspension to substrate/top agar solution and gently mix. Immediately pour
solution onto small (90 mm) NZY media plate and allow top agar to solidify completely
(approximately 30 minutes), then invert plate. Incubate plate at 39°C for 8-12 hours. Observe
plate for a clearing zone (halo) around a single plaque (pure clone). (If a single plaque
cannot be isolated, adjust titer and replate phage suspension.) Resuspend phage in
500 µL SM buffer and add 20 µL chloroform to inhibit any further cell growth.

Excision of pure clone: Allow pure phage suspension to incubate at room temperature for 2
to 3 hours or overnight at 4°C. Add 100 µL of pure phage suspension to 200 µL E. coli MRF'
cells (OD600=1.0). Add 1.0 µL of ExAssist helper phage (>1 x 10^6 pfu/mL; Stratagene).
Incubate suspension at 37°C for 15 minutes. Add 3.0 mL of 2 x YT media to cell suspension.
Incubate at 37°C for 2-2.5 hours while shaking. Transfer tube to 70°C for 20 minutes.
Transfer 50-100 µL of phagemid suspension to a micro tube containing 200 µL of E. coli Exp
505 cells (OD600=1.0). Incubate suspension at 37°C for 45 minutes. Plate 100 µL of cell
suspension on LB kan50 media (LB media with kanamycin 50 µg/mL). Incubate plate at 37°C
for 8-12 hours. Observe plate for colonies. Any colonies that grow contain the pure
phagemid. Pick a colony and grow a small (3-10 mL) liquid culture for 8-12 hours. Culture
media is liquid LB kan50.

Activity verification: Transfer 1.0 mL of liquid culture to a sterile micro tube. Centrifuge at
13200 rpm (16000 g's) for 1 minute. Discard supernatant and add 200 µL of phosphate buffer
pH 6.2. Sonicate for 5 to 10 seconds on ice using a micro tip. Add 200 µL of appropriate
substrate, mix gently and incubate at 37°C for 1.5-2 hours. A negative control should also be
run that contains only buffer and substrate. Add 1.0 mL absolute ethanol (200 proof) to
suspension and mix. Centrifuge at 13200 rpm for 10 minutes. Observe supernatant for
color. Amount of coloration may vary, but any tube with more coloration than the control is
considered positive for activity. A spectrophotometer can be used for this step if so desired
or needed. (For Azo-xylan, Megazyme, read at 590 nm.)
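
When a spectrophotometer is used, the positive call described above reduces to comparing each supernatant reading at 590 nm with the buffer-plus-substrate control. The sketch below illustrates this; the absorbance values are hypothetical, and the optional `margin` is an added assumption to allow for reading noise (the text itself treats any coloration above the control as positive, i.e. a margin of zero).

```python
def is_positive(sample_a590, control_a590, margin=0.0):
    """True if the sample absorbance exceeds the negative control by more than `margin`."""
    return sample_a590 - control_a590 > margin

# Hypothetical A590 readings of ethanol-treated supernatants.
readings = {"clone_1": 0.62, "clone_2": 0.11, "clone_3": 0.35}
control = 0.10   # buffer + substrate only

positives = [name for name, a590 in readings.items()
             if is_positive(a590, control, margin=0.05)]
print(positives)
```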

RFLP of pure clones from same Libraries: Transfer 1.0 mL of liquid culture to a sterile micro
tube. Centrifuge at 13200 rpm (16000 g's) for 1 minute. Follow QIAprep spin mini kit
(Qiagen) protocol for plasmid isolation and use 40 µL holy water as the elution buffer.
Transfer 10 µL plasmid DNA to a sterile micro tube. Add 1.5 µL Buffer 3 (New England
Biolabs), 1.5 µL 100X BSA solution (New England Biolabs) and 2.0 µL holy water. To this
add 1.0 µL Not I and 1.0 µL Pst I restriction endonucleases (New England Biolabs).
Incubate for 1.5 hours at 37°C. Add 3.0 µL 6X Loading buffer (Invitrogen). Run 15 µL of
digested sample on a 1.0% agarose gel for 1-1.5 hours at 120 volts. View the gel with a gel
imager. Perform sequence analysis on all clones with a different digest pattern.
Table 6 describes various properties of exemplary enzymes of the invention. 

Table 6

SEQ ID NO. | Topt* | Tstab** | pHopt* | Significant activities | pI | Mw | Notes
151, 152 | 50°C | <1 min at 65°C | 5.5-9.0 | AZO-xylan | 5.7 | 40.2 |
155, 156 | 50°C | <1 min at 65°C | 5.5-8.0 | AZO-xylan | 8.8 | 62.7 |
169, 170 | 50°C | >1 min at 65°C; <1 min at 85°C | 7.0 | AZO-xylan | 8.7 | 36.7 |
195, 196 | 50°C | >1 min and <10 min at 65°C; <1 min at 85°C | 5.5 | AZO-xylan | 8.5 | 36.7 |
215, 216 | 85°C | <3 min at 85°C | 5.5-8.0 | AZO-xylan | 8.6 | 34.8 |
47, 48 | 50°C | <0.5 min at 65°C; <1 min at 85°C | 7.0-8.0 | AZO-xylan | 6.2 | 40.3 |
191, 192 | ≥85°C | >30 sec at 85°C | 5.5 | AZO-xylan | 7.8 | 34.6 |
247, 248 | 50°C | <1 min at 65°C | 8.0 | AZO-xylan | 9.4 | 43.5 |
7, 8 | 50°C | >1 min and <5 min at 85°C | 5.5 | AZO-xylan | 4.5 | 55.3 |
221, 222 | 50-65°C | <1 min at 75°C | 5.5 | AZO-xylan | 8.3 | 34.6 |
163, 164 | 65°C | <1 min at 65°C | 7.0 | AZO-xylan | 6.3 | 36.0 |
19, 20 | 37°C | <5 min at 50°C | 7.0-8.0 | AZO-xylan | 9.2 | 41.5 |
87, 88 | 37-50°C | <1 min at 85°C | 8.0 | AZO-xylan | 5.2 | 36.7 |
81, 82 | 50°C | <1 min at 65°C | 7.0-9.0 | AZO-xylan | 5.3 | 38.8 |
91, 92 | 50°C | <1 min at 65°C | 7-8 | AZO-xylan, AZO-CMC | 5.4 | 39.0 |
61, 62 | 37°C | <5 min at 50°C | 7.0-9.0 | AZO-xylan, AZO-CMC | 5.4 | 40 |
159, 160 | 85°C | <30 sec at 85°C | 5.5 | AZO-xylan | 8.3 | 34.5 |
233, 234 | 50°C | >30 sec and <1 min at 65°C; <1 min at 85°C | 7.0 | AZO-xylan | 8.5 | 35.1 |
203, 204 | 50-65°C | >1 min and <5 min at 65°C; <1 min at 85°C | 5.5 | AZO-xylan | 9.5 | 21.7 |
181, 182 | ≥85°C | >1 min at 85°C | 5.5-8.0 | AZO-xylan | 8.8 | 35.5 |
227, 228 | 65°C | >1 min and <5 min at 85°C | 5.5-7.0 | AZO-xylan | 7.8 | 25.8 |
45, 46 | ≥45°C | ≥5 min at 45°C; <0.5 min at 55°C | >5.5 | AZO-xylan | 6.7 | 40.4 |
231, 232 | 65°C | >10 min at 50°C | 5.5-7.0 | AZO-xylan | 8.4 | 31.4 |
129, 130 | 65°C | <1 min at 75°C | 5.5 | AZO-xylan | 5.1 | 116 |
93, 94 | 50°C | <1 min at 60°C | 8.0-9.0 | AZO-xylan | 5.3 | 39.1 |
189, 190 | 65°C | <1 min at 65°C | 5.5 | AZO-xylan | 9.2 | 20.3 |
49, 50 | 70°C | <20 min at 70°C | >5 | AZO-xylan | 5.7 | 38.9 |
85, 86 | 50°C | >5 min at 85°C | 5.5-7.0 | AZO-xylan | 6.1 | 48.4 |
99, 100 | 50°C | <1 min at 75°C | 5.5-8.0 | AZO-xylan | 10.8 | 36.6 |
123, 124 | ≥85°C | <30 sec at 100°C | 5.5-7.0 | AZO-xylan | 6.1 | 44.1 |
249, 250 | 45°C | >1 min and <10 min at 75°C | 5.5 | AZO-xylan | 5.3 | 93 |
167, 168 | 85°C | <5 min at 85°C | 5.5 | AZO-xylan | 9.5 | 21.7 |
207, 208 | 75°C | <5 min at 65°C | 5.5 | AZO-xylan | 9.1 | 20.4 |
251, 252 | 65-75°C | <1 min at 85°C | 5.5 | AZO-xylan | 8.8 | 20.4 |
11, 12 | <90°C | <40 min at 70°C | >6 | AZO-xylan | 6.8 | 43.9 |
177, 178 | 65°C | <1 min at 75°C | 5.5 | AZO-xylan | 8.7 | 44.6 |
9, 10 | 50°C | <1 min at 65°C | 5.5-7.0 | AZO-xylan | 4.9 | 46.1 |
43, 44 | 37°C | unstable | 5.5-7.0 | AZO-xylan | 4.9 | 39.1 |
113, 114 | 65-75°C | <1 min at 75°C | 5.5-8.0 | AZO-xylan | 5 | 41.2 |

Table 6 (continued)

SEQ ID NO. | Topt* | Tstab** | pHopt* | Significant activities | pI | Mw | Notes
75, 76 | 50°C | <1 min at 85°C | 7.0-9.0 | AZO-xylan | 4.7 | 39.4 |
111, 112 | 37°C | >10 min at 50°C | 7-8 | AZO-xylan | 5.6 | 41.0 |
117, 118 | 37°C | unstable | 7-8 | AZO-xylan | 9.1 | 53.3 |
115, 116 | - | - | - | AZO-xylan | 8.9 | 50.8 |
125, 126 | 37°C | - | 8.0 | AZO-xylan | 5.3 | 41.1 |
137, 138 | 50°C | <30 sec at 65°C | 5.5 | AZO-xylan | 5.7 | 38.5 |
69, 70 | ≥85°C | <5 min at 85°C | 5.5-9.0 | AZO-xylan | 6.4 | 58.0 |
205, 206 | 50°C | <1 min at 65°C | 5.5-8 | AZO-xylan | 4.3 | 35.1 |
211, 212 | 50°C | <1 min at 65°C | 5.5 | AZO-xylan | 4.4 | 35.4 |
197, 198 | 65°C | <1 min at 65°C | 5.5 | AZO-xylan | 8.8 | 20.1 |
31, 32 | 37°C | unstable | 7.0 | AZO-xylan | 5.1 | 54.4 |
13, 14 | 50°C | <1 min at 65°C | 7 | AZO-xylan | 5.5 | 40.0 |
65, 66 | 50°C | <1 min at 65°C | 5.5 | AZO-xylan, AZO-CMC | 4.8 | 55.5 |
257, 258 | 37°C | unstable | 5.5 | AZO-xylan, AZO-barley β-glucan, AZO-CMC | 5.3 | 100.8 |
(missing) | 50°C | <1 min at 65°C | 7.0 | AZO-xylan | 4.8 | 56.7 |
185, 186 | 50-75°C | <1 min at 80°C | 5.5 | AZO-xylan | 8.6 | 23.2 |
243, 244 | 75°C | >0.5 min at 85°C | 5.5 | AZO-xylan | 8.8 | 44.4 |
77, 78 | 50°C | <5 min at 65°C; <1 min at 85°C | 5.5 | AZO-xylan | 5.3 | 44.5 |
(missing) | 37°C | ≥30 min at 55°C; <5 min at 75°C | 5.5 | AZO-xylan | 8.7 | 20.6 | ******
109, 110 | 65°C | >0.5 min at 75°C | 5.5 | AZO-xylan | 4.9 | 45.2 |
193, 194 | 65°C | <1 min at 75°C | 5.5 | AZO-xylan | 5.4 | 29.1 |
173, 174 | 65°C | <1 min at 80°C | 7.0 | AZO-xylan | 7.6 | 51.6 |
59, 60 | 37°C | <1 min at 65°C | 7.0 | AZO-xylan | 6.6 | 42.5 |
101, 102 | 50°C | >0.5 min at 65°C | 7.0 | AZO-xylan | 8.7 | 41.1 |
55, 56 | 37°C | >5 min at 50°C; <1 min at 85°C | 7.0 | AZO-xylan | 6.5 | 41.8 |
15, 16 | 50°C | <1 min at 65°C | 7.0 | AZO-xylan | 6.4 | 40.2 |
131, 132 | | | | AZO-xylan | 5.6 | 42.1 |
145, 146 | 65-85°C | <1 min at 85°C | 5.5 | AZO-xylan | 5.2 | 43.7 |
219, 220 | | | 5.5 | AZO-xylan | 6.6 | 34.5 |
253, 254 | 65°C | >0.5 min at 85°C | 5.5-7 | AZO-xylan | 7.8 | 34.6 |
255, 256 | 65°C | >1 min and <3 min at 65°C | 5.5-7.0 | AZO-xylan | 8.3 | 35.0 |

* pH or temperature optima determined by initial rates using AZO-xylan as a substrate
** thermal stability, time that enzyme retained significant activity (approx. > 50%)
*** Dough conditioning
**** GSSM™ parent for thermal tolerance evolution for animal feed applications
***** N35D mutation made to increase low pH activity, based on public knowledge; mutant
enzyme's relative activity at pH 4 significantly increased
****** Dough conditioning

EXAMPLE 2: GSSM™ SCREEN FOR THERMAL TOLERANT MUTANTS 

The following example describes an exemplary method for screening for
thermally tolerant enzymes.

Master Plates: Prepare plates for a colony picker by labeling 96 well plates and aliquoting
200 µL LB Amp100 into each well (~20 mL needed per 96 well plate). After the plates are
returned from the picker, remove media from row 6 from plate A. Replace with an
inoculation of SEQ ID NO:189. Place in a humidified 37°C incubator overnight.
Assay Plates: Pin tool cultures into a fresh 96 well plate (200 µL/well LB Amp100).
Remove plastic cover and replace with Gas Permeable Seal. Place in a humidified incubator
overnight. Remove the seal and replace plastic lid. Spin cultures down in tabletop centrifuge
at 3000 rpm for 10 min. Remove supernatant by inversion onto a paper towel. Aliquot 45
µL Cit-Phos-KCl buffer pH 6 into each well. Replace the plastic lid with an aluminum plate
seal. Use a roller to get a good seal. Resuspend cells in a plate shaker at level 6-7 for ~30
seconds.

Place the 96 well plate in 80°C incubator for 20 minutes. Do not stack.
Thereafter, immediately remove plates to ice water to cool for a few minutes. Remove the
aluminum seal and replace with a plastic lid. Add 30 µL of 2% Azo-xylan. Mix as before
on the plate shaker. Incubate at 37°C in a humidified incubator overnight.

Add 200 µL ethanol to each well and pipette up and down a couple of times to
mix. As an alternative to changing tips each time, rinse in an ethanol wash and dry by
expelling into a paper towel. Spin the plates at 3000 rpm for 10 minutes. Remove 100 µL of
supernatant to a fresh 96 well plate. Read the OD590.
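
The OD590 readings from this screen can be compared against the wells inoculated with the wild-type control to flag candidate thermotolerant clones. The following is a minimal sketch of that comparison, not part of the protocol itself: the function name `call_hits`, the well labels, the readings and the 1.5-fold cutoff are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def call_hits(od590, control_wells, fold_cutoff=1.5):
    """Return wells whose OD590 exceeds fold_cutoff x the mean control OD590.

    od590         -- dict mapping well label to OD590 reading
    control_wells -- set of well labels holding the wild-type control
    fold_cutoff   -- hypothetical threshold; tune it for the actual screen
    """
    control_mean = mean(od590[w] for w in control_wells)
    threshold = fold_cutoff * control_mean
    return sorted(
        well
        for well, value in od590.items()
        if well not in control_wells and value > threshold
    )


# Illustrative readings only (not data from this example).
readings = {"A1": 0.05, "A2": 0.31, "A3": 0.07, "F6": 0.06, "G6": 0.04}
controls = {"F6", "G6"}  # hypothetical wells holding the wild-type control
hits = call_hits(readings, controls)
```

A fold-over-control cutoff is only one possible criterion; a cutoff based on the spread of the control wells would be an equally reasonable choice.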

EXAMPLE 3: GSSM™ ASSAY FOR HIT VERIFICATION OF THERMAL TOLERANT 
MUTANTS 

The following example describes an exemplary method for assaying for
thermally tolerant enzymes.

Pin tool or pick clones into duplicate 96 well plates (200 µL/well LB Amp100).
Remove the plastic cover and replace with a Gas Permeable Seal. Place in a humidified
incubator overnight. Remove the seal and replace with a plastic lid. Pin tool the clones to
solid agar. Spin cultures down in tabletop centrifuge at 3000 rpm for 10 min. Remove the
supernatant by inversion onto a paper towel. Aliquot 25 µL BPER/Lysozyme/DNase solution
(see below) into each well. Resuspend cells in a plate shaker on level 6-7 for ~30 seconds.

Incubate the plate on ice for 15 minutes. Add 20 µL of Cit-Phos-KCl buffer
pH 6 into each well. Replace the plastic lid with an aluminum plate seal. Use a roller to get a
good seal. Mix on a plate shaker at level 6-7 for ~30 seconds.

Place one 96 well plate in an 80°C incubator for 20 minutes and the other at
37°C. Do not stack. Immediately remove the plates to watery ice to cool for a few minutes
(use a large plastic tray if needed). Remove the aluminum seal. Add 30 µL of 2% Azo-xylan.
Seal with a plastic gas permeable seal. Mix as before on the plate shaker. Incubate a set of
37°C and 80°C plates in humidified incubator at 37°C for 2 hours and another set for 4 hours.

After incubation, let the plate sit for ~5 minutes at room temperature. Add 200
µL ethanol to each well and pipette up and down a couple of times to mix. Instead of
changing tips each time, rinse in an ethanol wash and dry by expelling into a paper towel.
But use a new set of tips for each clone. Spin plates at 3000 rpm for 10 minutes. Remove 100
µL of supernatant to a fresh 96 well plate. Read OD590.
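
Because each clone is assayed in duplicate plates, one heat-challenged at 80°C and one held at 37°C, the two OD590 readings can be combined into a residual-activity fraction per clone. The sketch below assumes a simple blank-corrected ratio; the function name `residual_activity`, the clone labels, the readings and the blank value are hypothetical and are not taken from this example.

```python
def residual_activity(od_heated, od_unheated, blank=0.0):
    """Fraction of blank-corrected OD590 signal retained after the heat challenge.

    od_heated   -- dict: clone -> OD590 from the 80°C plate
    od_unheated -- dict: clone -> OD590 from the 37°C plate
    blank       -- OD590 of a buffer-plus-substrate well (assumed value)
    """
    result = {}
    for clone, unheated in od_unheated.items():
        denominator = unheated - blank
        if denominator <= 0:
            # No measurable activity in the 37°C plate: ratio is undefined.
            result[clone] = None
        else:
            result[clone] = (od_heated[clone] - blank) / denominator
    return result


# Illustrative readings only.
plate_37 = {"clone_1": 0.80, "clone_2": 0.60, "clone_3": 0.02}
plate_80 = {"clone_1": 0.40, "clone_2": 0.03, "clone_3": 0.02}
fractions = residual_activity(plate_80, plate_37, blank=0.02)
```

Clones with no measurable signal in the 37°C plate are reported as `None` rather than as zero, so that inactive clones are not mistaken for heat-labile ones.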

BPER/Lysozyme/DNase solution (4.74 mL total):
4.5 mL BPER
200 µL 10 mg/mL lysozyme (made fresh in pH 6 Cit-Phos buffer)
40 µL 5 mg/mL DNase I (made fresh in pH 6 Cit-Phos buffer)
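
As a quick arithmetic check on the recipe above, the following sketch sums the component volumes and works out the final lysozyme and DNase I concentrations in the mixed solution. The helper name `final_concentration` is an assumption introduced here for illustration.

```python
def final_concentration(stock_mg_per_ml, volume_ul, total_ul):
    """Concentration (mg/mL) of a component after dilution into the mix."""
    return stock_mg_per_ml * volume_ul / total_ul


# Component volumes in microliters, as listed in the recipe.
components_ul = {"BPER": 4500.0, "lysozyme": 200.0, "DNase I": 40.0}
total_ul = sum(components_ul.values())  # 4740 µL, i.e. 4.74 mL

lysozyme_final = final_concentration(10.0, components_ul["lysozyme"], total_ul)
dnase_final = final_concentration(5.0, components_ul["DNase I"], total_ul)

print(f"total volume: {total_ul / 1000:.2f} mL")
print(f"lysozyme: {lysozyme_final:.3f} mg/mL, DNase I: {dnase_final:.3f} mg/mL")
```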

EXAMPLE 4: Xylanase assay with wheat arabinoxylan as substrate

The following example describes an exemplary xylanase assay that can be
used, for example, to determine if an enzyme is within the scope of the invention.

SEQ ID NOS: 11, 12, 69, 70, 77, 78, 113, 114, 149, 150, 159, 160, 163, 164,
167, 168, 181, 182, 197, and 198 were subjected to an assay at pH 8 (Na-phosphate buffer)
and 70°C using wheat arabinoxylan as a substrate. The enzymes were characterized as set
forth in Table 7.
Table 7

SEQ ID NOS: | Protein concentration (mg/ml) | Volume of lysate added to each vial | # of vials | Units/ml* | Protein (mg/mL) | U/mg
11, 12 | 42 | 0.5 | 10 | 163 | 22.0 | 7.4
113, 114 | 37 | 0.6 | 10 | 66 | 22.0 | 3.0
163, 164 | 35 | 0.6 | 10 | 25 | 22.0 | 1.1
197, 198 | 23 | 1.0 | 10 | 31 | 22.0 | 1.4
167, 168 | 10 | 2.2 | 10 | 228 | 22.0 | 10.4
77, 78 | 47 | 0.5 | 10 | 29 | 22.0 | 1.3
69, 70 | 18 | 1.3 | 10 | 36 | 22.0 | 1.7
181, 182 | 28 | 0.8 | 10 | 24 | 22.0 | 1.1
159, 160 | 25 | 0.9 | 10 | 43 | 22.0 | 2.0
149, 150 | 42 | 0.5 | 10 | 24 | 22.0 | 1.1

*Based on addition of 1 mL of water to each sample.

Units are µmoles xylose released per minute based on a reducing sugar assay.
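
The U/mg column of Table 7 follows from dividing Units/ml by the protein concentration in mg/mL. The sketch below recomputes that column from the tabulated values as a consistency check; the function name `specific_activity` is introduced here for illustration, and small differences from the reported values are to be expected because the tabulated inputs are themselves rounded.

```python
# (SEQ ID NOS, Units/ml, protein mg/mL, reported U/mg) transcribed from Table 7.
TABLE_7 = [
    ("11, 12", 163, 22.0, 7.4),
    ("113, 114", 66, 22.0, 3.0),
    ("163, 164", 25, 22.0, 1.1),
    ("197, 198", 31, 22.0, 1.4),
    ("167, 168", 228, 22.0, 10.4),
    ("77, 78", 29, 22.0, 1.3),
    ("69, 70", 36, 22.0, 1.7),
    ("181, 182", 24, 22.0, 1.1),
    ("159, 160", 43, 22.0, 2.0),
    ("149, 150", 24, 22.0, 1.1),
]


def specific_activity(units_per_ml, protein_mg_per_ml):
    """Specific activity in U/mg from Units/ml and protein concentration."""
    return units_per_ml / protein_mg_per_ml


computed = {seq: specific_activity(u, p) for seq, u, p, _ in TABLE_7}
max_deviation = max(
    abs(specific_activity(u, p) - reported) for _, u, p, reported in TABLE_7
)

for seq, value in computed.items():
    print(f"SEQ ID NOS: {seq:9s} {value:5.2f} U/mg")
```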

EXAMPLE 5: Generation of an exemplary xylanase of the invention

The following example describes the generation of an exemplary xylanase of
the invention using gene site-saturation mutagenesis (GSSM™) technology, designated the
"9x" variant or mutant (the nucleic acid as set forth in SEQ ID NO:377, the polypeptide
sequence as set forth in SEQ ID NO:378).

GSSM™ was used to create a comprehensive library of point mutations in the
exemplary SEQ ID NO:190, "wild-type" xylanase (encoded by SEQ ID NO:189). The
xylanase thermotolerance screen described above identified nine single site amino acid
mutants (Figure 6A) (D8F, Q11H, N12L, G17I, G60H, P64V, S65V, G68A & S79P) that had
improved thermal tolerance relative to the wild type enzyme (as measured following a heat
challenge at 80°C for 20 minutes). Wild-type enzyme and all nine single site amino acid
mutants were produced in E. coli and purified utilizing an N-terminal hexahistidine tag.
There was no noticeable difference in activity due to the tag.

Figure 6 illustrates the nine single site amino acid mutants of "variant 9x", or,
as set forth in SEQ ID NO:378 (encoded by SEQ ID NO:377), as generated by Gene Site
Saturation Mutagenesis (GSSM™) of the exemplary SEQ ID NO:190 "wild-type" enzyme
(encoded by SEQ ID NO:189). Figure 6A is a schematic diagram illustrating position,
numbering and the amino acid change for the thermal tolerant point mutants of the "wild-type"
gene (SEQ ID NO:190, encoded by SEQ ID NO:189). A library of all 64 codons was
generated for every amino acid position in the gene (~13,000 mutants) and screened for
mutations that increased thermal tolerance. The "9X" variant was generated by combining all
9 single-site mutants into one enzyme. The corresponding melting temperature transition
midpoint (Tm) determined by DSC for each mutant enzyme and the "9X" (SEQ ID NO:378)
variant is shown on the right. Figure 6B illustrates the unfolding of the "wild-type" (SEQ ID
NO:190) and "9X" (SEQ ID NO:378) "variant/mutant" enzymes, monitored by DSC at a
scan rate of 1°C/min. Baseline subtracted DSC data were normalized for protein
concentration.
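
The quoted library size follows from simple arithmetic: 64 codons at each amino acid position. The sketch below shows that relationship and back-calculates the approximate number of positions implied by a library of about 13,000 members. The function names are hypothetical, and the 200-position input is only an illustrative round number, not the stated length of SEQ ID NO:190.

```python
CODONS_PER_POSITION = 64  # all possible nucleotide triplets (4**3)


def gssm_library_size(n_positions, codons_per_position=CODONS_PER_POSITION):
    """Number of codon variants when every position is fully saturated."""
    return n_positions * codons_per_position


def positions_for_library(library_size, codons_per_position=CODONS_PER_POSITION):
    """Approximate number of positions implied by a given library size."""
    return library_size / codons_per_position


example_size = gssm_library_size(200)           # illustrative gene length
approx_positions = positions_for_library(13000)  # from the ~13,000 figure

print(f"200 positions -> {example_size} codon variants")
print(f"~13,000 variants -> about {approx_positions:.0f} positions")
```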

Xylanase activity assays

Enzymatic activities were determined using 400 µL of 2% Azo-xylan as
substrate in 550 µL of CP (citrate-phosphate) buffer, pH 6.0 at the indicated temperatures.
Activity measurements as a function of pH were determined using 50 mM Britton and
Robinson buffer solutions (pH 3.0, 5.0, 6.0, 7.0, 8.0 and 9.0) prepared by mixing solutions of
0.1 M phosphoric acid solution, 0.1 M boric acid and 0.1 M acetic acid followed by pH
adjustment with 1 M sodium hydroxide. Reactions were initiated by adding 50 µL of 0.1
mg/ml of purified enzyme. Time points were taken from 0 to 15 minutes where 50 µL of
reaction mixture was added to 200 µL of precipitation solution (100% ethanol). When all
time points had been taken, samples were mixed, incubated for 10 minutes and centrifuged at
3000 g for 10 minutes at 4°C. Supernatant (150 µL) was aliquoted into a fresh 96 well plate
and absorbance was measured at 590 nm. A590 values were plotted against time and the initial
rate was determined from the slope of the line.
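
The slope-of-the-line step can be illustrated with an ordinary least-squares fit of A590 against time. This is a minimal sketch under the assumption that all sampled time points lie in the linear range; the function name `initial_rate` and the absorbance values are hypothetical and not data from this example.

```python
def initial_rate(times_min, a590):
    """Least-squares slope of A590 versus time (absorbance units per minute)."""
    n = len(times_min)
    if n < 2 or n != len(a590):
        raise ValueError("need at least two paired time/absorbance points")
    mean_t = sum(times_min) / n
    mean_a = sum(a590) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, a590))
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    return sxy / sxx


# Illustrative time course sampled between 0 and 15 minutes.
times = [0, 5, 10, 15]
absorbance = [0.02, 0.12, 0.22, 0.32]
rate = initial_rate(times, absorbance)
print(f"initial rate: {rate:.3f} A590/min")
```

Relative initial rates, such as those reported for the single site mutants, would then be obtained by dividing each variant's slope by the wild-type slope measured under the same conditions.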

Differential Scanning Calorimetry (DSC).

Calorimetry was performed using a Model 6100 Nano II DSC apparatus
(Calorimetry Sciences Corporation, American Fork, UT) using the DSCRun software
package for data acquisition, CpCalc for analysis, CpConvert for conversion into molar heat
capacity from microwatts and CpDeconvolute for deconvolution. Analysis was carried out
with 1 mg/mL recombinant protein in 20 mM potassium phosphate (pH 7.0) and 100 mM KCl
at a scan rate of 1°C/min. A constant pressure of 5 atm was maintained during all DSC
experiments to prevent possible degassing of the solution on heating. The instrumental
baseline was recorded routinely before the experiments with both cells filled with buffer.
Reversibility of the thermally induced transitions was tested by reheating the solution in the
calorimeter cell immediately after cooling the first run.

Thermal tolerance determination. 

All enzymes were analyzed for thermal tolerance at 80°C in 20 mM potassium
phosphate (pH 7.0) and 100 mM KCl. The enzymes were heated at 80°C for 0, 5, 10 or 30
minutes in thin-walled tubes and were cooled on ice. Residual activities were determined
with Azo-xylan as substrate using the assay described above for activity measurement.

Polysaccharide Fingerprinting. 

Polysaccharide fingerprints were determined by polysaccharide analysis using
carbohydrate gel electrophoresis (PACE). Beechwood xylan (0.1 mg/mL, 100 µL, Sigma,
Poole, Dorset, UK) or xylooligosaccharides (1 mM, 20 µL, Megazyme, Wicklow, Ireland)
were treated with enzyme (1-3 µg) in a total volume of 250 µL for 16 hours. The reaction
was buffered in 0.1 M ammonium acetate pH 5.5. Controls without substrates or enzymes
were performed under the same conditions to identify any unspecific compounds in the
enzymes, polysaccharides/oligosaccharides or labeling reagents. The reactions were stopped
by boiling for 20 min. Assays were independently performed at least 2 times for each
condition. Derivatization using ANTS (8-aminonaphthalene-1,3,6-trisulfonic acid, Molecular
Probes, Leiden, The Netherlands), electrophoresis and imaging were carried out as described
(Goubet, F., Jackson, P., Deery, M. and Dupree, P. (2002) Anal. Biochem. 300, 53-68).

Fitness Calculation.

The fitness (F_n), for a given enzyme variant, n, was calculated by equally
weighting increase in denaturation temperature transition midpoint (Tm) and increase (or
decrease) in enzymatic activity relative to the largest difference in each parameter across all
variants: F_n = F_Tn + F_Vn, where F_Tn = Tm fitness factor of the variant and F_Vn = activity
fitness factor of the variant. The fitness factors for each (Tm and activity) are relative to the
largest difference in Tm or rate across all of the variants. F_Tn = (Tm_n - Tm_L) / (Tm_H - Tm_L),
where Tm_n is the Tm for the given variant, n, Tm_L is the lowest Tm across all variants and
Tm_H is the highest Tm across all variants; and F_Vn = (V_n - V_L) / (V_H - V_L), where V_n is
the relative rate for the given variant, n, V_L is the lowest rate across all variants and V_H is
the highest rate across all variants.
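
The fitness definition can be sketched in a few lines of code. The example below is illustrative only. It uses the Tm values and relative initial rates that this example quotes for the wild-type enzyme and five single site mutants, and it fixes the Tm range at 61°C to 95°C (wild-type, and wild-type plus the 34 degree shift reported for the "9X" variant). The rate range is taken from the single site mutants alone, which is an assumption, because the rates of the combinatorial variants are not given numerically; the absolute fitness values therefore differ from those underlying Figure 10B. The function name `fitness` and its optional bounds arguments are hypothetical.

```python
def fitness(variants, tm_bounds=None, rate_bounds=None):
    """Return name -> F_n = F_Tn + F_Vn for each variant.

    variants    -- dict: name -> (Tm in °C, relative initial rate)
    tm_bounds   -- optional (lowest Tm, highest Tm) across all variants
    rate_bounds -- optional (lowest rate, highest rate) across all variants
    """
    tms = [tm for tm, _ in variants.values()]
    rates = [rate for _, rate in variants.values()]
    tm_low, tm_high = tm_bounds or (min(tms), max(tms))
    v_low, v_high = rate_bounds or (min(rates), max(rates))
    scores = {}
    for name, (tm, rate) in variants.items():
        f_t = (tm - tm_low) / (tm_high - tm_low)   # Tm fitness factor
        f_v = (rate - v_low) / (v_high - v_low)    # activity fitness factor
        scores[name] = f_t + f_v
    return scores


# Tm (°C) and relative initial rate as quoted in the text of this example.
single_site = {
    "wild-type": (61.0, 1.0),
    "D8F": (67.0, 0.65),
    "Q11H": (70.0, 0.35),
    "G17I": (67.0, 0.76),
    "P64V": (69.0, 1.0),
    "S65V": (66.0, 1.2),
}

# Assumed Tm range: wild-type (61°C) up to the "9X" variant (61 + 34 = 95°C).
scores = fitness(single_site, tm_bounds=(61.0, 95.0))
best_single = max((n for n in scores if n != "wild-type"), key=scores.get)
print(best_single, round(scores[best_single], 3))
```

Under these assumptions the highest-scoring single site mutant is S65V, in line with the observation that its activity is not reduced relative to wild-type even though its Tm gain is smaller than that of Q11H.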

Evolution by the GSSM™ method. 

GSSM™ technology was used to create a comprehensive library of point
mutations in the exemplary xylanase of the invention SEQ ID NO:190 (encoded by SEQ ID
NO:189); including the exemplary xylanase of the invention SEQ ID NO:378 (encoded by
SEQ ID NO:377). The xylanase thermotolerance screen described above identified nine
single site amino acid mutants (Figure 6A), D8F, Q11H, N12L, G17I, G60H, P64V, S65V,
G68A & S79P, that had improved thermal tolerance relative to the exemplary "wild type"
enzyme SEQ ID NO:190 (encoded by SEQ ID NO:189), as measured following a heat
challenge at 80°C for 20 minutes. Wild-type enzyme and all nine single site amino acid
mutants were produced in E. coli and purified utilizing an N-terminal hexahistidine tag.
There was no noticeable difference in activity due to the tag.

To determine the effect of the single amino acid mutations on enzymatic
activity, all nine mutants were purified and their xylanase activity (initial rates at the
wild-type temperature optimum, 70°C) was compared to that of the exemplary SEQ ID NO:190
"wild-type" enzyme. Enzyme activities were comparable to wild type (initial rate normalized
to 1.0) for D8F, N12L, G17I, G60H, P64V, S65V, G68A and S79P mutants (relative initial
rates 0.65, 0.68, 0.76, 1.1, 1.0, 1.2, 0.98 and 0.84 respectively) confirming that these
mutations do not significantly alter the enzymatic activity. Initial rates were measured 3 or
more times and variance was typically less than 10%. In contrast to these eight mutants, a
notable reduction in enzymatic activity was observed for the best thermal tolerant, single site
mutant, Q11H (relative initial rate 0.35).

Melting temperature (Tm) of "wild-type" and thermal tolerant single site amino acid mutant
enzymes.

The purified SEQ ID NO:190 "wild-type" xylanase and the nine thermal
tolerant single site amino acid mutants were analyzed using differential scanning calorimetry
(DSC). Aggregation was apparent for the wild-type enzyme as evidenced by a shoulder in
the DSC trace for its thermal denaturation, see Figure 6B. The evolved mutant enzymes
showed no indication of aggregation. For all enzymes, thermally induced denaturation was
irreversible and no discernible transition was observed in a second scan of the sample. Due
to the irreversibility of denaturation, only the apparent Tm (melting temperature) could be
calculated (as described, e.g., by Sanchez-Ruiz (1992) Biophys. J. 61:921-935; Beldarrain
(2000) Biotechnol. Appl. Biochem. 31:77-84). The Tm of the wild-type enzyme was 61°C
while the Tm's of all point mutants were increased and ranged from 64°C to 70°C (Figure 6A).
The Q11H mutation introduced the largest increase (Tm = 70°C) over wild-type followed by
P64V (69°C), G17I (67°C) and D8F (67°C).

The "9X" combined GSSM™ exemplary enzyme SEQ ID NO: 378 

The "9X" enzyme (SEQ ID NO:378) was constructed by combining the
single-site changes of the nine thermal tolerant up-mutants by site-directed mutagenesis
(Figure 6A). The "9X" (SEQ ID NO:378) enzyme was expressed in E. coli and purified to
homogeneity. DSC was performed to determine the melting temperature. The Tm of the "9X"
enzyme was 34 degrees higher than SEQ ID NO:190, the "wild-type" enzyme, demonstrating
a dramatic shift in its thermal stability (Figure 6B).

To evaluate the effect of the combined mutations and elevated melting
temperature on the enzyme's biochemical properties, pH and temperature profiles were
constructed and compared to SEQ ID NO:190, the "wild-type" enzyme. Figure 7 illustrates
the biochemical characterization of "wild type" and "evolved" 9X mutant enzymes. Figure
7A illustrates the pH-dependence of activity for the wild-type and evolved 9X mutant
enzymes. Xylanase activity was measured at 37°C at each pH and the initial velocity was
plotted against absorbance at 590 nm to determine initial rates. Figure 7B illustrates the
temperature-dependence of activity for the wild-type and evolved 9X mutant enzymes. The
optimum temperatures of the wild-type and 9X mutant enzymes were measured over a
temperature range of 25-100°C at pH 6.0 and are based on initial rates measured over 5
minutes. Figure 7C illustrates the thermal stability of wild-type and evolved 9X mutant
enzymes. Thermal dependence of activity of the wild-type and evolved 9X mutant enzymes
was measured by first heating the enzymes at each of the indicated temperatures for 5
minutes followed by cooling to room temperature and the measurement of residual activity
(initial rate at 37°C, pH 6.0). For all experiments initial rates were measured 2 or more times
and the variation was less than 10%.

SEQ ID NO:190 and SEQ ID NO:378 (the "9X" mutant) enzymes had
comparable pH/activity profiles with the highest activity between pH 5 and 6 (Figure 7A).
Both enzymes had similar initial rate/temperature optima at 70°C; however, SEQ ID NO:190,
the "wild-type" enzyme, had higher activity at lower temperatures (25-50°C) whereas SEQ ID
NO:378 (the "9X" mutant) retained more than 60% of its activity up to 100°C (determined by
initial rate) in the presence of substrate (Figure 7B). The activity of SEQ ID NO:190, the
"wild-type" enzyme, was not detectable at temperatures above 70°C.

To determine the effect of the 9 combined mutations on enzyme thermal
tolerance, residual activity was measured and compared to SEQ ID NO:190, the "wild-type"
enzyme. Residual activity was determined by a heat challenge for 5 minutes at each
temperature (37, 50, 60, 70, 80 and 90°C) followed by activity measurements at 37°C. SEQ
ID NO:190 was completely inactivated above 70°C while the evolved 9X mutant displayed
significant activity after heating at 70, 80 and even 90°C (Figure 7C). Furthermore, although
the activity of the wild-type enzyme decreased with increasing temperature, the 9X variant
was somewhat activated by heating at temperatures up to 60°C.

Generation of combinatorial GSSM™ variants using GeneReassembly™ technology. 

To identify combinatorial variants of the 9 single site amino acid mutants with
highest thermal tolerance and activity compared to the additively constructed SEQ ID
NO:378 (the "9X" variant), a GeneReassembly™ library (U.S. Patent No. 6,537,776) of all
possible mutant combinations (2^9) was constructed and screened. Using thermal tolerance as
the screening criterion, 33 unique combinations of the nine mutations were identified, as was
the original 9X variant. A secondary screen was performed to select for variants with higher
activity/expression than the evolved 9X. This screen yielded 10 variants with sequences
possessing between 6 and 8 of the original single mutations in various combinations, as
illustrated in Figure 8A. Figure 8 illustrates the combinatorial variants identified using
GeneReassembly™ technology. Figure 8A illustrates the GeneReassembly™ library of all
possible combinations of the 9 GSSM™ point mutations that was constructed and screened
for variants with improved thermal tolerance and activity. Eleven variants including the 9X
variant were obtained. As shown in the figure, the variants possessed 6, 7, 8, or 9 of the point
mutations in various combinations. The corresponding melting temperature transition
midpoint (Tm) determined by DSC of each variant is shown on the right. Figure 8B
illustrates the relative activity (initial rate measured over a 5 minute time period) of the 6X-2
and 9X variants compared to wild-type at the temperature optimum (70°C) and pH 6.0. Error
bars show the range in the initial rate for 3 measurements.

The melting temperature (Tm) of each of the combinatorial variants was at
least 28°C higher than wild type (Figure 8A) and all of the reassembly variants displayed
higher relative activity than the 9X enzyme. The activity of one variant in particular, 6X-2,
was greater than the wild-type enzyme and significantly better (1.7X) than the 9X enzyme
(Figure 8B). Sequence comparison of the reassembly variants identified at least 6 mutations
that were required for the enhanced thermostability (>20 degrees). All 33 unique variants
found in the initial thermostability screen contained both Q11H and G17I mutations,
demonstrating their importance for thermal tolerance.
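
The size of the combinatorial space described above can be checked by direct enumeration. The sketch below lists every subset of the nine point mutations (2^9 = 512, counting the unmutated wild-type as the empty combination) and counts the subsets that carry both Q11H and G17I, and those that carry six to nine mutations. It illustrates the counting only and says nothing about how the physical library was built; the function and variable names are hypothetical.

```python
from itertools import combinations

MUTATIONS = ["D8F", "Q11H", "N12L", "G17I", "G60H", "P64V", "S65V", "G68A", "S79P"]


def all_combinations(mutations):
    """Yield every subset of the mutations, from none (wild-type) to all."""
    for k in range(len(mutations) + 1):
        for combo in combinations(mutations, k):
            yield frozenset(combo)


library = list(all_combinations(MUTATIONS))

# Subsets that contain both residues found in every thermotolerant variant.
with_key_pair = [c for c in library if {"Q11H", "G17I"} <= c]

# Subsets carrying 6, 7, 8 or 9 of the point mutations.
six_to_nine = [c for c in library if 6 <= len(c) <= 9]

print(len(library), len(with_key_pair), len(six_to_nine))
```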

Analysis of wild-type and variant polysaccharide product fingerprints. 

The products generated by the "wild-type," 6X-2 and 9X variants were
compared by polysaccharide analysis using carbohydrate gel electrophoresis (PACE).
Different substrates (oligosaccharides and polysaccharides) were tested for hydrolysis by the
xylanases. The digestion products of the 3 xylanases tested were very similar, as illustrated
in Figure 9. All three enzymes hydrolyzed (Xyl)6 and (Xyl)5, mainly into both (Xyl)3 and
(Xyl)2, and (Xyl)4 was hydrolyzed to (Xyl)2 (Figure 9A). Only a small amount of hydrolysis
of (Xyl)3 into (Xyl)2 and Xyl was observed, indicating that (Xyl)3 is a relatively poor substrate
for the enzyme. No activity was detected on (Xyl)2. Beechwood xylan, which contains
glucuronosyl residues, was hydrolyzed by all three enzymes mainly into (Xyl)2 and (Xyl)3,
but other bands were detected that migrated between oligoxylan bands (Figure 9B). In PACE
analysis, each oligosaccharide has a specific migration depending on the sugar composition
and degree of polymerization (Goubet, F., Jackson, P., Deery, M. and Dupree, P. (2002)
Anal. Biochem. 300, 53-68); thus, these bands likely correspond to oligoglucuronoxylans.
Therefore, the evolved enzymes retained the substrate specificity of the "wild-type" enzyme.

As noted above, Figure 9 illustrates the product fingerprints of "wild-type"
SEQ ID NO:190 (encoded by SEQ ID NO:189), 6X-2 (SEQ ID NO:380, encoded by SEQ ID
NO:379) and SEQ ID NO:378 (the "9X" mutant) enzyme variant, as determined by PACE.
Figure 9A illustrates fingerprints obtained after hydrolysis of oligoxylans (Xyl)3, (Xyl)4,
(Xyl)5 and (Xyl)6 by "wild-type" and variant enzymes. Control lanes contain oligosaccharide
incubated under the assay conditions in the absence of enzyme. Figure 9B illustrates the
fingerprints obtained after hydrolysis of Beechwood xylan by wild-type and variant enzymes.
Standards contained (Xyl)2, (Xyl)3, (Xyl)4. All assays were performed at 37°C and pH 5.5.

A combination of laboratory gene evolution strategies was used to rapidly
generate a highly active, thermostable xylanase optimized for process compatibility in a
number of industrial market applications. GSSM™ methodology was employed to scan the
entire sequence of the exemplary "wild type" xylanase SEQ ID NO:190 (encoded by SEQ ID
NO:189) and to identify 9 point mutations that improve its thermal tolerance. Although it
had no discernible effect on the hydrolysis product profile of the enzyme, as illustrated in
Figure 9, the addition of the 9 mutations to the protein sequence resulted in a moderate
reduction in enzymatic specific activity at the temperature optimum of SEQ ID NO:190 (the
"wild-type"), 70°C, see Figure 9B. Using the GeneReassembly™ method to generate a
combinatorial library of the 9 single site amino acid mutants, this reduction in activity was
overcome. Ten thermostable variants (Tm's between 89°C and 94°C) with activity better than
the "9X" variant were obtained from screening the GeneReassembly™ library. With a Tm of
90°C, enzymatic specific activity surpassing wild-type and a product fingerprint unaltered
and comparable to SEQ ID NO:190 (the "wild-type"), the 6X-2 variant (SEQ ID NO:380,
encoded by SEQ ID NO:379) is particularly notable. To our knowledge the shift in Tm
obtained for these variants is the highest increase reported from the application of directed
evolution technologies.

SEQ ID NO:380 (the 6X-2 variant) includes the following changes, as
compared to SEQ ID NO:190 (the "wild-type"): D8F, Q11H, G17I, G60H, S65V and G68A.
SEQ ID NO:379 includes the following nucleotide changes, as compared to the "wild type"
SEQ ID NO:189: the nucleotides at positions 22 to 24 are TTC, the nucleotides at positions
31 to 33 are CAC, the nucleotides at positions 49 to 51 are ATA, the nucleotides at positions
178 to 180 are CAC, the nucleotides at positions 193 to 195 are GTG, the nucleotides at
positions 202 to 204 are GCT.
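
The nucleotide changes listed for SEQ ID NO:379 can be cross-checked against the amino acid changes listed for SEQ ID NO:380 by converting each nucleotide range to a codon number and translating the new codon with the standard genetic code. The sketch below does this for the six listed codons only; it assumes the coding sequence is numbered from the first nucleotide of codon 1, and the function name `codon_number` is introduced here for illustration.

```python
# Standard genetic code entries for the codons listed above.
CODON_TO_AA = {"TTC": "F", "CAC": "H", "ATA": "I", "GTG": "V", "GCT": "A"}

# (first nucleotide, last nucleotide, new codon) as listed for SEQ ID NO:379.
NUCLEOTIDE_CHANGES = [
    (22, 24, "TTC"),
    (31, 33, "CAC"),
    (49, 51, "ATA"),
    (178, 180, "CAC"),
    (193, 195, "GTG"),
    (202, 204, "GCT"),
]


def codon_number(first_nt, last_nt):
    """1-based codon index for an in-frame nucleotide triplet."""
    if last_nt - first_nt != 2 or last_nt % 3 != 0:
        raise ValueError("positions do not span one in-frame codon")
    return last_nt // 3


residue_changes = [
    f"{codon_number(first, last)}{CODON_TO_AA[codon]}"
    for first, last, codon in NUCLEOTIDE_CHANGES
]
print(residue_changes)
```

The resulting position and residue pairs match the D8F, Q11H, G17I, G60H, S65V and G68A changes listed for the 6X-2 variant.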

In order to gauge the effectiveness of combinatorial mixing versus addition of
the point mutants to the desired phenotype, a fitness parameter combining contributions both
from changes in enzyme activity and thermostability was calculated for each mutant. The
term fitness as described here is not an objective measure that can be compared to other
enzymes, but rather a term that allows the measurement of the success of directed evolution
of this particular xylanase. Since enzyme fitness, F, is calculated by equally weighting
changes in Tm and enzyme activity for this set of variants, the maximum allowable fitness
value is 2 (F_T ≤ 1 and F_V ≤ 1, see above). In other words, if the variant with the best activity
also had the highest Tm, its fitness value would be 2. With a fitness value near 2 (see Fig.
10B), the 6X-2 variant (SEQ ID NO:380, encoded by SEQ ID NO:379) is the closest to
possessing the best possible combination of thermal stability and enzyme activity. The single
site mutation that confers the highest value of fitness is S65V. Although the Tm of the S65V
mutant is lower than that of the Q11H mutant (66°C versus 70°C respectively), it has a higher
fitness value since its specific activity is not reduced relative to wild-type.

Figure 10A is a schematic diagram illustrating the level of thermal stability
(represented by Tm) improvement over "wild-type" obtained by GSSM™ evolution. The
single site amino acid mutant and the combinatorial variant with the highest thermal stability
(Q11H and "9X" (SEQ ID NO:378), respectively) are shown in comparison to wild-type.
Figure 10B illustrates a "fitness diagram" of enzyme improvement obtained by combining
GSSM™ and GeneReassembly™ technologies. Fitness was determined using the formula F
= FT + FV where fitness (F) is calculated by equally weighting thermal tolerance fitness (FT)
and relative activity fitness (FV) as described above. The point mutation that confers the
greatest fitness (S65V) is shown. Combining all 9 point mutations also improved fitness
(SEQ ID NO:378, the "9X" variant). However, the largest improvement in fitness was
obtained by combining GSSM™ and GeneReassembly™ methods to obtain the best variant,
6X-2 (SEQ ID NO:380).
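
The formula F = FT + FV can be illustrated with a minimal sketch. The definitions of FT and FV are given elsewhere in the specification and are not repeated in this passage, so the sketch assumes a simple min-max normalization across the set of variants, which is consistent with each component being at most 1 and the maximum fitness being 2. The variant names, Tm values and activity values below are hypothetical and are not measured data.

```python
def normalized(value, lowest, highest):
    """Scale a value to the range 0..1 across the variant set (assumed form)."""
    return (value - lowest) / (highest - lowest)


def fitness_scores(variants):
    """Compute F = FT + FV for each variant.

    variants maps a name to a (Tm in deg C, relative activity) pair.
    FT and FV are assumed here to be min-max normalized Tm and activity.
    """
    tms = [tm for tm, _ in variants.values()]
    activities = [activity for _, activity in variants.values()]
    scores = {}
    for name, (tm, activity) in variants.items():
        ft = normalized(tm, min(tms), max(tms))
        fv = normalized(activity, min(activities), max(activities))
        scores[name] = ft + fv
    return scores


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the calculation.
    example = {
        "variant_low": (60.0, 0.5),
        "variant_mid": (70.0, 1.0),
        "variant_best": (90.0, 1.2),
    }
    for name, score in fitness_scores(example).items():
        print(f"{name}: F = {score:.2f}")
```

With this normalization, a variant that has both the highest Tm and the highest activity in the set scores exactly 2, matching the maximum fitness value described in the text.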

The GeneReassembly™ method also allowed the identification of important
residues that appear absolutely necessary for improved thermal stability. Two key residues,
Q11H and G17I, were present in every GeneReassembly™ variant identified based on
thermal tolerance (see Figure 6A). The structural determinants for thermal stability of
proteins have been studied and several theories have been documented, e.g., by Kinjo (2001)
Eur. Biophys. J. 30:378-384; Britton (1999) J. Mol. Biol. 293:1121-1132; Ladenstein (1998)
Adv. Biochem. Eng. Biotechnol. 61:37-85; Britton (1995) Eur. J. Biochem. 229:688-695;
Tanner (1996) Biochemistry 35:2597-2609; Vetriani (1998) Proc. Natl. Acad. Sci. USA
95:2300-2305. Hydrogen bonding patterns, ionic interactions, hydrophobic packing and
decreased length of surface loops are among the key factors even though the contribution of
each to protein stability is not fully understood. Given that most of the beneficial point
substitutions identified from testing all possible single amino acid substitutions involved the
replacement of relatively polar, charged or small (glycine) residues with much larger
hydrophobic residues, it can be surmised that hydrophobic interactions play the most significant
role in enhancing the thermostability of this protein. Even with a good understanding of the
optimal interactions to enhance thermal tolerance, the prediction of where to make mutations
that introduce such interactions is not straightforward. A nonrational approach using the
GSSM™ method, however, allows rapid sampling of all sidechains at all positions within a
protein structure. Such an approach leads to the discovery of amino acid substitutions that
introduce functional interactions that could not have been foreseen.

EXAMPLE 6: Pre-treating paper pulp with xylanases of the invention

In one aspect, xylanases of the invention can be used to pretreat paper pulp.
This example describes an exemplary routine screening protocol to determine whether a
xylanase is useful in pretreating paper pulp; e.g., in reducing the use of bleaching chemicals
(e.g., chlorine dioxide, ClO2) when used to pretreat Kraft paper pulp.

The screening protocol has two alternative test parameters: impact of xylanase
treatment after an oxygen delignification step (post-O2 pulp); and impact of xylanase in a
process that does not include oxygen delignification (pre-O2 brownstock).

The pulp treatment conditions simulate process conditions in industrial
situations, e.g., factories: pH 8.0; 70°C; 60 min duration.

The process is schematically depicted in the Flow Diagram of Figure 11.
Twenty xylanases that were active under these conditions were identified by
biochemical tests. Of the 20 xylanases, 6 were able to significantly reduce ClO2 demand when
they were used to pretreat Kraft pulp before it was chemically bleached. The six are: SEQ ID
NO:182 (encoded by SEQ ID NO:181); SEQ ID NO:160 (encoded by SEQ ID NO:159);
SEQ ID NO:198 (encoded by SEQ ID NO:197); SEQ ID NO:168 (encoded by SEQ ID
NO:167); SEQ ID NO:216 (encoded by SEQ ID NO:215); SEQ ID NO:260 (encoded by
SEQ ID NO:259). Others showed some activity but were less effective. Xylanases SEQ ID
NO:182 (encoded by SEQ ID NO:181) and SEQ ID NO:160 (encoded by SEQ ID NO:159)
are modular and contain a carbohydrate binding module in addition to the xylanase catalytic
domain. It was demonstrated that truncated derivatives of these 2 xylanases containing just
the catalytic domain are more effective in this application. The best xylanase, SEQ ID
NO:160 (encoded by SEQ ID NO:159), was studied more comprehensively. Results can be
summarized as follows:

- pretreatment of post-O2 spruce/pine/fir (SPF) pulp with 2 units/g of SEQ ID
NO:160 (encoded by SEQ ID NO:159) reduces subsequent ClO2 use by 22% to reach 65% GE
brightness;

- pretreatment of pre-O2 brownstock SPF with 0.5 units/g SEQ ID NO:160
(encoded by SEQ ID NO:159) reduces subsequent ClO2 use by 13% to reach 65% GE
brightness;

- pretreatment of pre-O2 Aspen pulp with 0.5 units/g SEQ ID NO:160
(encoded by SEQ ID NO:159) reduces ClO2 use by at least 22%;

- pretreatment of pre-O2 Douglas Fir/Hemlock pulp with 0.5 units/g SEQ ID
NO:160 (encoded by SEQ ID NO:159) reduces ClO2 use by at least 22%;

- under the treatment conditions employed, the reduction in yield from the
xylanase treatment did not exceed 0.5% when compared with pulp that had been bleached at
the same kappa factor but not treated with xylanase;

- optimal conditions for treating post-O2 SPF pulp with SEQ ID NOS:159, 160
were: pH 6-7, enzyme dose 0.3 units/g, treatment time 20-25 min. Under these conditions,
reduction in ClO2 use of 28% was possible to reach 69% GE brightness.

In further experiments:

SEQ ID NO:160 (XYLA), encoded by SEQ ID NO:159 - full-length wild-type xylanase:

• XYLA (E.c) = truncated variant of SEQ ID NOS:159, 160 containing only the xylanase
catalytic domain, expressed in E. coli
• XYLA (P.f) = ditto but expressed in P. fluorescens

SEQ ID NO:182 (XYLB), encoded by SEQ ID NO:181 - second full-length wild-type xylanase:

• XYLB (E.c) = truncated variant of SEQ ID NOS:181, 182 containing only the xylanase
catalytic domain, expressed in E. coli
• XYLB (P.f) = ditto but expressed in P. fluorescens



Dose Response Data for Lead Xylanases on Pre-O2 Brownstock

Conditions for xylanase stage (X-stage) as follows:
pH 8
Temperature 70°C
Time 60 min
Kappa factor 0.24
For no-enzyme control, kappa factor was 0.30

Results showed a dose-dependent increase in brightness for xylanase-treated
samples at a lower charge of chlorine dioxide (ClO2) (Kf 0.24 vs Kf 0.30).
In each case, the truncated derivative appeared to be more effective than the
full-length xylanase. Optimal xylanase dose appeared to be around 0.6 to 0.7 U/g pulp.

Pretreatment of Intercontinental Pre-O2 Brownstock with the best 4 Xylanases
Determination of ClO2 Dose Response in D0
Experimental outline

• Pre-O2 Brownstock
o Initial kappa 31.5
• X stage conditions
o Xylanase charge 0.7 U/gm
o Temperature 70°C
o pH 8
o Treatment time 1 hr
o Pulp consistency 10%
• Bleach sequence XDEp
o Kappa factor 0.22, 0.26 and 0.30 (%D on pulp: 2.63, 3.12 and 3.60)
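
The relationship between kappa factor and chlorine dioxide charge in the outline above can be sketched as follows. The sketch assumes the conventional definition of kappa factor (percent active chlorine on pulp divided by kappa number) and the conventional equivalence of 2.63 kg active chlorine per kg ClO2; neither is stated in this passage, and the function names are illustrative. Under these assumptions the calculation reproduces the listed charges to within about 0.01 percentage point.

```python
# Assumed conversion: 1 kg ClO2 is equivalent to 2.63 kg active chlorine.
ACTIVE_CHLORINE_PER_CLO2 = 2.63


def clo2_charge_percent(kappa_factor, kappa_number):
    """Return the ClO2 charge as percent on oven-dry pulp.

    Assumes kappa factor = (% active chlorine on pulp) / kappa number.
    """
    active_chlorine_percent = kappa_factor * kappa_number
    return active_chlorine_percent / ACTIVE_CHLORINE_PER_CLO2


def clo2_charge_kg_per_ton(kappa_factor, kappa_number):
    """Return the ClO2 charge in kg per ton of oven-dry pulp (1% = 10 kg/ton)."""
    return 10.0 * clo2_charge_percent(kappa_factor, kappa_number)


if __name__ == "__main__":
    initial_kappa = 31.5  # initial kappa of the brownstock, from the outline
    for kf in (0.22, 0.26, 0.30):
        percent = clo2_charge_percent(kf, initial_kappa)
        print(f"Kf {kf:.2f}: {percent:.2f}% ClO2 on pulp")
```

A lower kappa factor at the same initial kappa therefore corresponds directly to a lower ClO2 charge, which is why brightness achieved at Kf 0.24 with xylanase is compared against Kf 0.30 without it.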

Final brightness after 3-stage bleach sequence versus Kappa factor (ClO2 charge):

• XYLB - At 61.5 final brightness, X-stage enables reduction in ClO2 use of 3.89
kg/ton pulp.

• XYLB (E.c) - At 61.5 final brightness, X-stage enables reduction in ClO2 charge of
4.07 kg/ton pulp.

• XYLA - At 61.5 brightness, X-stage enables a reduction in ClO2 use of 4.07 kg/ton
pulp.

• XYLA (E.c) - At 61.5 final brightness, X-stage enables reduction in ClO2 use of 4.90
kg/ton pulp.

Determination of ClO2 Dose Response in D0:

Xylanase        ClO2 Savings in D0 (kg/ton OD)    Kf reduction in D0
XYLB            3.89                              11.7%
XYLB (E.c)      5.08                              15.8%
XYLA            4.07                              12.2%
XYLA (E.c)      4.90                              14.7%

Xylanase 0.7 U/g, pH 8.0, 70°C, 1 hr
Pulp: Pre-O2 Brownstock, initial kappa 31.5

Percentage saving of ClO2 is of little significance to the industry. Its
primary concern is lbs of ClO2 required per ton OD pulp. This makes sense when one
considers that a lower percentage saving seen with a high initial kappa brownstock can be
more valuable in terms of lbs of ClO2 saved than a higher percentage reduction for a low
initial kappa pulp, which will require a lower total charge of ClO2 to reach target brightness.
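
The point about percentage versus absolute savings can be made concrete with a short sketch. The charges and percentages below are hypothetical numbers chosen only to illustrate the arithmetic; they are not data from the experiments described here.

```python
def absolute_saving(total_charge_lb_per_ton, percent_saving):
    """Return the ClO2 saved in lb per ton OD pulp for a given percent saving."""
    return total_charge_lb_per_ton * percent_saving / 100.0


if __name__ == "__main__":
    # Hypothetical high initial kappa pulp: large total charge, smaller % saving.
    high_kappa = absolute_saving(total_charge_lb_per_ton=80.0, percent_saving=10.0)
    # Hypothetical low initial kappa pulp: small total charge, larger % saving.
    low_kappa = absolute_saving(total_charge_lb_per_ton=40.0, percent_saving=15.0)
    print(f"High kappa pulp: {high_kappa:.1f} lb ClO2 saved per ton")
    print(f"Low kappa pulp:  {low_kappa:.1f} lb ClO2 saved per ton")
```

In this illustration the pulp with the lower percentage saving still saves more ClO2 per ton, because its total charge is larger.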

Relationship between Brightness, Yield and Kappa Factor for Bleached Control Pulp:
The results showed that bleaching with increasing doses of ClO2 to achieve
higher target brightness results in increased loss of pulp yield. This is an issue because pulp at
this stage of the process has a value of almost $400 per ton and loss of cellulose costs money.

A benefit of xylanase (e.g., a xylanase of the invention) is that use of a lower
ClO2 dose can reduce yield losses as long as the action of the xylanase itself does not cancel
out the gain.

Dose Response Data for Pretreatment of Pre-O2 Brownstock with Xylanase XYLB (P.f):
Experimental outline

• Northwood Pre-O2 Brownstock
-Initial kappa 28.0
-Initial consistency 32.46%
-Initial brightness 28.37
• X stage conditions
-Xylanase charge 0 to 2.70 U/gm
-Temperature 58°C to 61°C
-pH 8.2 to 8.5
-Treatment time 1 hr
• Bleach sequence XDEp
-Kappa factor 0.24
-ClO2 saving calculated for Kappa factors between 0.24 and 0.30

The purpose of this experiment was to evaluate the best of the 4 xylanases on 
unwashed SPF brownstock. Results showed dose-dependent increases in final brightness for 
pulp treated with XYLB (E.c), with brightness achieved in presence of xylanase at lower Kf 
of 0.24, approaching brightness achieved at higher Kf of 0.30 asymptotically. 

Relationship between Dose of Xylanase XYLB (E.c) and Chlorine Dioxide Saving (Pre-O2
Brownstock):

ClO2 Saving in % OD Pulp    ClO2 Saving in kg/ton Pulp    Xylanase Dose in U/gm
0.299%                      2.99                          0.31
0.363%                      3.63                          0.51
0.406%                      4.06                          0.71
0.439%                      4.39                          0.91
0.483%                      4.83                          1.26
0.523%                      5.32                          1.80
0.687%                      5.87                          2.70



The optimum xylanase dose lies in the range 0.5 to 0.9 U/g. Above this dose there is a
diminishing return per unit increment of xylanase. Reductions in chlorine dioxide dose per
ton of pulp treated of this magnitude are commercially significant.
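
The diminishing return per unit increment of xylanase can be seen by computing the incremental ClO2 saving between consecutive doses. The sketch below uses the dose and kg/ton values as listed in the table above; the function name is illustrative and the approach (a simple finite difference) is only one way to look at the data.

```python
# (xylanase dose in U/gm, ClO2 saving in kg/ton pulp), as listed in the table.
DOSE_RESPONSE = [
    (0.31, 2.99),
    (0.51, 3.63),
    (0.71, 4.06),
    (0.91, 4.39),
    (1.26, 4.83),
    (1.80, 5.32),
    (2.70, 5.87),
]


def marginal_savings(points):
    """Return extra kg ClO2 saved per additional U/gm between adjacent doses."""
    return [
        (saving_hi - saving_lo) / (dose_hi - dose_lo)
        for (dose_lo, saving_lo), (dose_hi, saving_hi) in zip(points, points[1:])
    ]


if __name__ == "__main__":
    doses = [dose for dose, _ in DOSE_RESPONSE]
    for lo, hi, slope in zip(doses, doses[1:], marginal_savings(DOSE_RESPONSE)):
        print(f"{lo:.2f} -> {hi:.2f} U/gm: {slope:.2f} kg/ton saved per U/gm")
```

Each successive increment of enzyme yields a smaller additional saving than the one before it, which is consistent with choosing a dose toward the lower end of the tested range.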

Three-stage biobleaching procedure

A three-stage biobleaching procedure was developed that would closely
simulate the actual bleaching operations in a pulp mill bleach plant (Fig. 1). This bleach
sequence is designated by (X)D0Ep, in which X represents the xylanase treatment stage, D0
the chlorine dioxide bleaching stage, and Ep the alkaline peroxide extraction stage. The
primary feedstock used in our application tests was Southern Softwood Kraft Brownstock
(without oxygen delignification). The most effective xylanase candidates that showed high
bleach chemical reduction potential in the biobleaching assays were also tested on two
species of hardwood Kraft pulp (maple and aspen). Upon completion of each biobleaching
round, the ensuing pulp was used to produce TAPPI (Technical Association of Pulp and
Paper Industries)-standard handsheets. The GE% brightness of each handsheet was
measured, and the brightness values were used as the indication of how well each enzyme
had performed on the pulp during the enzymatic pretreatment stage (X).
Results:

Out of approximately 110 xylanases that were screened using the (X)D0Ep
biobleaching sequence, 4 enzymes, i.e., XYLA (P.f); XYLB (P.f); SEQ ID NO:216 (encoded
by SEQ ID NO:215); SEQ ID NO:176 (encoded by SEQ ID NO:175), showed the greatest
potential for reducing the use of bleaching chemicals. While XYLA (P.f) and XYLB (P.f)
exhibited equally high performance (best among the four good performers), XYLA (P.f)
showed a better pH tolerance than XYLB (P.f). The results can be summarized as follows:

• It is possible to achieve a handsheet brightness of 60 (GE%) using a three-stage bleach
sequence [(X)D0Ep] that involves pretreatment of Southern Softwood Kraft Brownstock
with the following four enzymes at the loading levels listed below (pH 8, 65°C and 1 h):

o XYLA (P.f) at 0.55 U/g pulp
o XYLB (P.f) at 0.75 U/g pulp
o SEQ ID NOS:215, 216 at 1.80 U/g pulp
o SEQ ID NOS:175, 176 at 1.98 U/g pulp

• Pretreatment of Southern Softwood Kraft Brownstock with 2 U/g pulp of XYLA (P.f)
reduces ClO2 use by 18.7% to reach a final GE% brightness of 61.

• XYLA (P.f) exhibits good tolerance at higher pH and provides more than 14% chemical
savings when the enzymatic pretreatment stage is run at pH 10.

• Pretreatment of Southern Softwood Kraft Brownstock with 2 U/g pulp of XYLB (P.f)
reduces ClO2 use by 16.3% to reach a final GE% brightness of 60.5.

• Pretreatment of aspen Kraft pulp with 2 U/g pulp of XYLA (P.f) and XYLB (P.f)
reduces ClO2 use by about 35% to reach a final GE% brightness of 77.

• Pretreatment of maple Kraft pulp with 2 U/g pulp of XYLA (P.f) and XYLB (P.f)
reduces ClO2 use by about 38% to reach a final GE% brightness of 79.

• The two best performing xylanases, namely XYLA (P.f) and XYLB (P.f), are truncated
enzymes, containing just the catalytic domain, and were produced in Pseudomonas
fluorescens.

While the invention has been described in detail with reference to certain
preferred aspects thereof, it will be understood that modifications and variations are within
the spirit and scope of that which is described and claimed.

WHAT IS CLAIMED IS:

1. An isolated or recombinant nucleic acid comprising a nucleic acid
sequence having at least 50% sequence identity to
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35,
SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71,
SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83,
SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95,
SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107,
SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119,
SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID NO:129, SEQ ID NO:131,
SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143,
SEQ ID NO:145, SEQ ID NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155,
SEQ ID NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID NO:167,
SEQ ID NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID NO:179,
SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID NO:189, SEQ ID NO:191,
SEQ ID NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID NO:203,
SEQ ID NO:205, SEQ ID NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215,
SEQ ID NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID NO:227,
SEQ ID NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID NO:239,
SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID NO:249, SEQ ID NO:251,
SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID NO:263,
SEQ ID NO:265, SEQ ID NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275,
SEQ ID NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID NO:287,
SEQ ID NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID NO:299,
SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID NO:309, SEQ ID NO:311,
SEQ ID NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID NO:323,
SEQ ID NO:325, SEQ ID NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335,
SEQ ID NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID NO:347,
SEQ ID NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID NO:359,
SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID NO:369, SEQ ID NO:371,
SEQ ID NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID NO:379, over a region of at least about 100 residues,
wherein the nucleic acid encodes at least one polypeptide having a xylanase activity, and the
sequence identities are determined by analysis with a sequence comparison algorithm or by a
visual inspection.

2. The isolated or recombinant nucleic acid of claim 1, wherein the
sequence identity is at least about 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%,
61%, 62%, 63% or 64%.

3. The isolated or recombinant nucleic acid of claim 1, wherein the
sequence identity is at least about 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%,
75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%,
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more sequence identity to
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35,
SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71,
SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83,
SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95,
SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107,
SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119,
SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID NO:129, SEQ ID NO:131,
SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143,
SEQ ID NO:145, SEQ ID NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155,
SEQ ID NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID NO:167,
SEQ ID NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID NO:179,
SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID NO:189, SEQ ID NO:191,
SEQ ID NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID NO:203,
SEQ ID NO:205, SEQ ID NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215,
SEQ ID NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID NO:227,
SEQ ID NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID NO:239,
SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID NO:249, SEQ ID NO:251,
SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID NO:263,
SEQ ID NO:265, SEQ ID NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275,
SEQ ID NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID NO:287,
SEQ ID NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID NO:299,
SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID NO:309, SEQ ID NO:311,
SEQ ID NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID NO:323,
SEQ ID NO:325, SEQ ID NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335,
SEQ ID NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID NO:347,
SEQ ID NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID NO:359,
SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID NO:369, SEQ ID NO:371,
SEQ ID NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID NO:379.

4. The isolated or recombinant nucleic acid of claim 1, wherein the
sequence identity is over a region of at least about 50, 75, 100, 150, 200, 250, 300, 350, 400,
450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150 or more
residues, or the full length of a gene or a transcript.

5. The isolated or recombinant nucleic acid of claim 1, wherein the
nucleic acid sequence comprises a sequence as set forth in
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35,
SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71,
SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83,
SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95,
SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107,
SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119,
SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID NO:129, SEQ ID NO:131,
SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143,
SEQ ID NO:145, SEQ ID NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155,
SEQ ID NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID NO:167,
SEQ ID NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID NO:179,
SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID NO:189, SEQ ID NO:191,
SEQ ID NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID NO:203,
SEQ ID NO:205, SEQ ID NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215,
SEQ ID NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID NO:227,
SEQ ID NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID NO:239,
SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID NO:249, SEQ ID NO:251,
SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID NO:263,
SEQ ID NO:265, SEQ ID NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275,
SEQ ID NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID NO:287,
SEQ ID NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID NO:299,
SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID NO:309, SEQ ID NO:311,
SEQ ID NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID NO:323,
SEQ ID NO:325, SEQ ID NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335,
SEQ ID NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID NO:347,
SEQ ID NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID NO:359,
SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID NO:369, SEQ ID NO:371,
SEQ ID NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID NO:379.

6. The isolated or recombinant nucleic acid of claim 1, wherein the
nucleic acid sequence encodes a polypeptide having a sequence as set forth in
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12,
SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24,
SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36,
SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48,
SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72,
SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:78, SEQ ID NO:80, SEQ ID NO:82, SEQ ID NO:84,
SEQ ID NO:86, SEQ ID NO:88, SEQ ID NO:90, SEQ ID NO:92, SEQ ID NO:94, SEQ ID NO:96,
SEQ ID NO:98, SEQ ID NO:100, SEQ ID NO:102, SEQ ID NO:104, SEQ ID NO:106, SEQ ID NO:108,
SEQ ID NO:110, SEQ ID NO:112, SEQ ID NO:114, SEQ ID NO:116, SEQ ID NO:118, SEQ ID NO:120,
SEQ ID NO:122, SEQ ID NO:124, SEQ ID NO:126, SEQ ID NO:128, SEQ ID NO:130, SEQ ID NO:132,
SEQ ID NO:134, SEQ ID NO:136, SEQ ID NO:138, SEQ ID NO:140, SEQ ID NO:142, SEQ ID NO:144,
SEQ ID NO:146, SEQ ID NO:148, SEQ ID NO:150, SEQ ID NO:152, SEQ ID NO:154, SEQ ID NO:156,
SEQ ID NO:158, SEQ ID NO:160, SEQ ID NO:162, SEQ ID NO:164, SEQ ID NO:166, SEQ ID NO:168,
SEQ ID NO:170, SEQ ID NO:172, SEQ ID NO:174, SEQ ID NO:176, SEQ ID NO:178, SEQ ID NO:180,
SEQ ID NO:182, SEQ ID NO:184, SEQ ID NO:186, SEQ ID NO:188, SEQ ID NO:190, SEQ ID NO:192,
SEQ ID NO:194, SEQ ID NO:196, SEQ ID NO:198, SEQ ID NO:200, SEQ ID NO:202, SEQ ID NO:204,
SEQ ID NO:206, SEQ ID NO:208, SEQ ID NO:210, SEQ ID NO:212, SEQ ID NO:214, SEQ ID NO:216,
SEQ ID NO:218, SEQ ID NO:220, SEQ ID NO:222, SEQ ID NO:224, SEQ ID NO:226, SEQ ID NO:228,
SEQ ID NO:230, SEQ ID NO:232, SEQ ID NO:234, SEQ ID NO:236, SEQ ID NO:238, SEQ ID NO:240,
SEQ ID NO:242, SEQ ID NO:244, SEQ ID NO:246, SEQ ID NO:248, SEQ ID NO:250, SEQ ID NO:252,
SEQ ID NO:254, SEQ ID NO:256, SEQ ID NO:258, SEQ ID NO:260, SEQ ID NO:262, SEQ ID NO:264,
SEQ ID NO:266, SEQ ID NO:268, SEQ ID NO:270, SEQ ID NO:272, SEQ ID NO:274, SEQ ID NO:276,
SEQ ID NO:278, SEQ ID NO:280, SEQ ID NO:282, SEQ ID NO:284, SEQ ID NO:286, SEQ ID NO:288,
SEQ ID NO:290, SEQ ID NO:292, SEQ ID NO:294, SEQ ID NO:296, SEQ ID NO:298, SEQ ID NO:300,
SEQ ID NO:302, SEQ ID NO:304, SEQ ID NO:306, SEQ ID NO:308, SEQ ID NO:310, SEQ ID NO:312,
SEQ ID NO:314, SEQ ID NO:316, SEQ ID NO:318, SEQ ID NO:320, SEQ ID NO:322, SEQ ID NO:324,
SEQ ID NO:326, SEQ ID NO:328, SEQ ID NO:330, SEQ ID NO:332, SEQ ID NO:334, SEQ ID NO:336,
SEQ ID NO:338, SEQ ID NO:340, SEQ ID NO:342, SEQ ID NO:344, SEQ ID NO:346, SEQ ID NO:348,
SEQ ID NO:350, SEQ ID NO:352, SEQ ID NO:354, SEQ ID NO:356, SEQ ID NO:358, SEQ ID NO:360,
SEQ ID NO:362, SEQ ID NO:364, SEQ ID NO:366, SEQ ID NO:368, SEQ ID NO:370, SEQ ID NO:372,
SEQ ID NO:374, SEQ ID NO:376, SEQ ID NO:378 or SEQ ID NO:380.

7. The isolated or recombinant nucleic acid of claim 1, wherein the
sequence comparison algorithm is a BLAST version 2.2.2 algorithm where a filtering setting
is set to blastall -p blastp -d "nr pataa" -F F, and all other options are set to default.

8. The isolated or recombinant nucleic acid of claim 1, wherein the
xylanase activity comprises catalyzing hydrolysis of internal β-1,4-xylosidic linkages.

9. The isolated or recombinant nucleic acid of claim 8, wherein the
xylanase activity comprises an endo-1,4-beta-xylanase activity.

10. The isolated or recombinant nucleic acid of claim 1, wherein the
xylanase activity comprises hydrolyzing a xylan to produce a smaller molecular weight
xylose and xylo-oligomer.

11. The isolated or recombinant nucleic acid of claim 10, wherein the
xylan comprises an arabinoxylan.

12. The isolated or recombinant nucleic acid of claim 11, wherein the
arabinoxylan comprises a water soluble arabinoxylan.

13. The isolated or recombinant nucleic acid of claim 12, wherein the
water soluble arabinoxylan comprises a dough or a bread product.

14. The isolated or recombinant nucleic acid of claim 1, wherein the
xylanase activity comprises hydrolyzing polysaccharides comprising 1,4-β-glycoside-linked
D-xylopyranoses.

15. The isolated or recombinant nucleic acid of claim 1, wherein the
xylanase activity comprises hydrolyzing hemicelluloses.

16. The isolated or recombinant nucleic acid of claim 15, wherein the
xylanase activity comprises hydrolyzing hemicelluloses in a wood or paper pulp or a paper
product.

17. The isolated or recombinant nucleic acid of claim 8, wherein the
xylanase activity comprises catalyzing hydrolysis of xylans in a feed or a food product.

18. The isolated or recombinant nucleic acid of claim 17, wherein the feed
or food product comprises a cereal-based animal feed, a wort or a beer, a milk or a milk
product, a fruit or a vegetable.

19. The isolated or recombinant nucleic acid of claim 1, wherein the
xylanase activity comprises catalyzing hydrolysis of xylans in a microbial cell or a plant cell.

20. The isolated or recombinant nucleic acid of claim 1, wherein the
xylanase activity is thermostable.

21. The isolated or recombinant nucleic acid of claim 20, wherein the
polypeptide retains a xylanase activity under conditions comprising a temperature range of
between about 37°C to about 95°C, or between about 55°C to about 85°C, or between about
70°C to about 75°C, or between about 70°C to about 95°C, or between about 90°C to about
95°C.

22. The isolated or recombinant nucleic acid of claim 1, wherein the
xylanase activity is thermotolerant.

23. The isolated or recombinant nucleic acid of claim 22, wherein the
polypeptide retains a xylanase activity after exposure to a temperature in the range from
greater than 37°C to about 95°C, from greater than 55°C to about 85°C, or between about
70°C to about 75°C, or from greater than 90°C to about 95°C.

24. An isolated or recombinant nucleic acid, wherein the nucleic acid
comprises a sequence that hybridizes under stringent conditions to a nucleic acid comprising
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35,
SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71,
SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83,
SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95,
SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107,
SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119,
SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID NO:129, SEQ ID NO:131,
SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143,
SEQ ID NO:145, SEQ ID NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155,
SEQ ID NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID NO:167,
SEQ ID NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID NO:179,
SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID NO:189, SEQ ID NO:191,
SEQ ID NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID NO:203,
SEQ ID NO:205, SEQ ID NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215,
SEQ ID NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID NO:227,
SEQ ID NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID NO:239,
SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID NO:249, SEQ ID NO:251,
SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID NO:263,
SEQ ID NO:265, SEQ ID NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275,
SEQ ID NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID NO:287,
SEQ ID NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID NO:299,
SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID NO:309, SEQ ID NO:311,
SEQ ID NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID NO:323,
SEQ ID NO:325, SEQ ID NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335,
SEQ ID NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID NO:347,
SEQ ID NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID NO:359,
SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID NO:369, 
5 SEQ ID NO:371 , SEQ ID NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID NO:379, 
wherein the nucleic acid encodes a polypeptide having a xylanase activity. 

25. The isolated or recombinant nucleic acid of claim 24, wherein the
nucleic acid is at least about 50, 75, 100, 150, 200, 300, 400, 500, 600, 700, 800, 900, 1000 or
more residues in length or the full length of the gene or transcript.

26. The isolated or recombinant nucleic acid of claim 24, wherein the 
stringent conditions include a wash step comprising a wash in 0.2X SSC at a temperature of 
about 65°C for about 15 minutes. 
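
By way of non-limiting illustration, the salt concentration implied by a 0.2X SSC wash can be worked out from the conventional composition of SSC (saline-sodium citrate), in which a 20X stock is 3.0 M sodium chloride and 0.3 M sodium citrate, so that 1X is 0.15 M and 0.015 M respectively. The Python sketch below performs that arithmetic; the names `SSC_1X_MOLAR` and `ssc_molarity` are hypothetical and are introduced only for this example.

```python
# Conventional 1X SSC composition in mol/L (a 20X stock is 3.0 M NaCl and
# 0.3 M sodium citrate). Names here are illustrative assumptions.
SSC_1X_MOLAR = {"NaCl": 0.150, "sodium citrate": 0.015}


def ssc_molarity(strength):
    """Return the molarity of each SSC component at the given X strength."""
    return {name: conc * strength for name, conc in SSC_1X_MOLAR.items()}


# The wash step recited above uses 0.2X SSC.
wash = ssc_molarity(0.2)
for name, molar in wash.items():
    print(f"{name}: {molar * 1000:.1f} mM")
```

Under these assumptions a 0.2X SSC wash corresponds to about 30 mM sodium chloride and about 3 mM sodium citrate.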

27. A nucleic acid probe for identifying a nucleic acid encoding a 
polypeptide with a xylanase activity, wherein the probe comprises at least 10 consecutive 
bases of a sequence comprising SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29,
SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID
NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51,
SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID
NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71, SEQ ID NO:73,
SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83, SEQ ID
NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95,
SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ
ID NO:107, SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID
NO:117, SEQ ID NO:119, SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID
NO:127, SEQ ID NO:129, SEQ ID NO:131, SEQ ID NO:133, SEQ ID NO:135, SEQ ID
NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143, SEQ ID NO:145, SEQ ID
NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155, SEQ ID
NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID
NO:167, SEQ ID NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID
NO:177, SEQ ID NO:179, SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID
NO:187, SEQ ID NO:189, SEQ ID NO:191, SEQ ID NO:193, SEQ ID NO:195, SEQ ID
NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID NO:203, SEQ ID NO:205, SEQ ID
NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215, SEQ ID
NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID
NO:227, SEQ ID NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID 
NO:237, SEQ ID NO:239, SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID 
NO:247, SEQ ID NO:249, SEQ ID NO:251, SEQ ID NO:253, SEQ ID NO:255, SEQ ID 
NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID NO:263, SEQ ID NO:265, SEQ ID 
NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275, SEQ ID 
NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID 
NO:287, SEQ ID NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID 
NO:297, SEQ ID NO:299, SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID 
NO:307, SEQ ID NO:309, SEQ ID NO:311, SEQ ID NO:313, SEQ ID NO:315, SEQ ID
NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID NO:323, SEQ ID NO:325, SEQ ID 
NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335, SEQ ID 
NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID 
NO:347, SEQ ID NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID 
NO:357, SEQ ID NO:359, SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID 
NO:367, SEQ ID NO:369, SEQ ID NO:371, SEQ ID NO:373, SEQ ID NO:375, SEQ ID
NO:377 or SEQ ID NO:379, wherein the probe identifies the nucleic acid by binding or 
hybridization. 

28. The nucleic acid probe of claim 27, wherein the probe comprises an
oligonucleotide comprising at least about 10 to 50, about 20 to 60, about 30 to 70, about 40 to 
80, about 60 to 100, or about 50 to 150 consecutive bases. 
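
By way of non-limiting illustration, every run of a given number of consecutive bases in a sequence can be enumerated with a sliding window, which is one simple way to list candidate probe sequences of a chosen length. The Python sketch below assumes a probe length of 10 bases; the function name `candidate_probes` and the example sequence are hypothetical placeholders and are not taken from the sequence listing.

```python
def candidate_probes(sequence, length=10):
    """Yield every run of `length` consecutive bases found in `sequence`."""
    seq = sequence.upper()
    # A sequence of n bases contains n - length + 1 windows of this length.
    for start in range(len(seq) - length + 1):
        yield seq[start:start + length]


# Placeholder 20-base sequence used only for illustration.
example = "ATGGCTAGCTTAGGCATCGA"
probes = list(candidate_probes(example, length=10))
print(len(probes), probes[0], probes[-1])
```

Such a listing only enumerates subsequences; whether a given candidate identifies a target nucleic acid by binding or hybridization would still need to be determined experimentally.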

29. A nucleic acid probe for identifying a nucleic acid encoding a 
polypeptide having a xylanase activity, wherein the probe comprises a nucleic acid 
comprising at least about 10 consecutive residues of a nucleic acid sequence having at least 
50% sequence identity to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ
ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID
NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41,
SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID
NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63,
SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71, SEQ ID NO:73, SEQ ID
NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85,
SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95, SEQ ID
NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID
NO:107, SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID
NO:117, SEQ ID NO:119, SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID
NO:127, SEQ ID NO:129, SEQ ID NO:131, SEQ ID NO:133, SEQ ID NO:135, SEQ ID
NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143, SEQ ID NO:145, SEQ ID
NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155, SEQ ID
NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID
NO:167, SEQ ID NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID
NO:177, SEQ ID NO:179, SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID
NO:187, SEQ ID NO:189, SEQ ID NO:191, SEQ ID NO:193, SEQ ID NO:195, SEQ ID
NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID NO:203, SEQ ID NO:205, SEQ ID
NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215, SEQ ID
NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID
NO:227, SEQ ID NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID
NO:237, SEQ ID NO:239, SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID
NO:247, SEQ ID NO:249, SEQ ID NO:251, SEQ ID NO:253, SEQ ID NO:255, SEQ ID
NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID NO:263, SEQ ID NO:265, SEQ ID
NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275, SEQ ID
NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID
NO:287, SEQ ID NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID
NO:297, SEQ ID NO:299, SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID
NO:307, SEQ ID NO:309, SEQ ID NO:311, SEQ ID NO:313, SEQ ID NO:315, SEQ ID
NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID NO:323, SEQ ID NO:325, SEQ ID
NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335, SEQ ID
NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID
NO:347, SEQ ID NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID
NO:357, SEQ ID NO:359, SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID
NO:367, SEQ ID NO:369, SEQ ID NO:371, SEQ ID NO:373, SEQ ID NO:375, SEQ ID
NO:377 or SEQ ID NO:379, wherein the sequence identities are determined by analysis with
a sequence comparison algorithm or by visual inspection. 
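
By way of non-limiting illustration, percent sequence identity for a pair of sequences that have already been aligned can be computed as sketched below in Python. The function name `percent_identity`, the use of "-" as the gap character, the choice to count gapped columns in the denominator, and the example sequences are all assumptions made for this sketch and are not taken from the sequence listing. A sequence comparison algorithm of the kind recited above (for example BLAST) would also generate the alignment itself and may use a different convention for the denominator.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Both inputs must be the same length and use '-' for gaps. Columns that
    are gaps in both sequences are ignored; a column with a gap in only one
    sequence counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Placeholder aligned sequences: one substitution and one gap in ten columns.
identity = percent_identity("ATGGCTAGCT", "ATGGATAG-T")
meets_threshold = identity >= 50.0  # the 50% figure recited in the claim
print(identity, meets_threshold)
```

The comparison against 50% mirrors the threshold recited in the claim; the same function could be applied to any other threshold.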



30. The nucleic acid probe of claim 29, wherein the probe comprises an
oligonucleotide comprising at least about 10 to 50, about 20 to 60, about 30 to 70, about 40 to
80, about 60 to 100, or about 50 to 150 consecutive bases.

31. An amplification primer pair for amplifying a nucleic acid encoding a
polypeptide having a xylanase activity, wherein the primer pair is capable of amplifying a
nucleic acid comprising a sequence as set forth in claim 1 or claim 24, or a subsequence
thereof.



32. The amplification primer pair of claim 31, wherein a member of the
amplification primer sequence pair comprises an oligonucleotide comprising at least about 10
to 50 consecutive bases of the sequence, or, about 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24, 25, 26, 27, 28, 29, 30 or more consecutive bases of the sequence.

33. An amplification primer pair, wherein the primer pair comprises a first
member having a sequence as set forth by about the first (the 5') 12, 13, 14, 15, 16, 17, 18,
19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more residues of SEQ ID NO:1, SEQ ID
NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ
ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25,
SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID
NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID
NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69,
SEQ ID NO:71, SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID
NO:81, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91,
SEQ ID NO:93, SEQ ID NO:95, SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID
NO:103, SEQ ID NO:105, SEQ ID NO:107, SEQ ID NO:109, SEQ ID NO:111, SEQ ID
NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119, SEQ ID NO:121, SEQ ID
NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID NO:129, SEQ ID NO:131, SEQ ID
NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID
NO:143, SEQ ID NO:145, SEQ ID NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID
NO:153, SEQ ID NO:155, SEQ ID NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID
NO:163, SEQ ID NO:165, SEQ ID NO:167, SEQ ID NO:169, SEQ ID NO:171, SEQ ID
NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID NO:179, SEQ ID NO:181, SEQ ID
NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID NO:189, SEQ ID NO:191, SEQ ID
NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID
NO:203, SEQ ID NO:205, SEQ ID NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID
NO:213, SEQ ID NO:215, SEQ ID NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID
NO:223, SEQ ID NO:225, SEQ ID NO:227, SEQ ID NO:229, SEQ ID NO:231, SEQ ID
NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID NO:239, SEQ ID NO:241, SEQ ID
NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID NO:249, SEQ ID NO:251, SEQ ID
NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID
NO:263, SEQ ID NO:265, SEQ ID NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID
NO:273, SEQ ID NO:275, SEQ ID NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID
NO:283, SEQ ID NO:285, SEQ ID NO:287, SEQ ID NO:289, SEQ ID NO:291, SEQ ID
NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID NO:299, SEQ ID NO:301, SEQ ID
NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID NO:309, SEQ ID NO:311, SEQ ID
NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID
NO:323, SEQ ID NO:325, SEQ ID NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID
NO:333, SEQ ID NO:335, SEQ ID NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID
NO:343, SEQ ID NO:345, SEQ ID NO:347, SEQ ID NO:349, SEQ ID NO:351, SEQ ID
NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID NO:359, SEQ ID NO:361, SEQ ID
NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID NO:369, SEQ ID NO:371, SEQ ID
NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID NO:379, and a second member
having a sequence as set forth by about the first (the 5') 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
22, 23, 24, 25, 26, 27, 28, 29, 30 or more residues of the complementary strand of the first
member.
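
By way of non-limiting illustration, a primer pair of the kind described above can be derived from a template sequence as sketched below in Python. The sketch takes the first (5') residues of the template as the first member and the first (5') residues of the complementary strand, read 5' to 3' (that is, the reverse complement of the template), as the second member. This reading of "complementary strand", the function names `reverse_complement` and `primer_pair`, and the placeholder template are assumptions made for this example; the template is not taken from the sequence listing.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand of `seq`, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_pair(template, length=20):
    """Return (first_member, second_member), each `length` residues long.

    The first member is the 5' end of the template; the second member is
    the 5' end of the complementary strand.
    """
    if not 12 <= length <= len(template):
        raise ValueError("length must be at least 12 and no longer than the template")
    first_member = template[:length].upper()
    second_member = reverse_complement(template)[:length]
    return first_member, second_member


# Placeholder 26-base template used only for illustration.
template = "ATGGCTAGCTTAGGCATCGATCCGTA"
fwd, rev = primer_pair(template, length=12)
print(fwd, rev)
```

The lower bound of 12 residues in the sketch follows the shortest length recited in the claim; practical primer design would also weigh factors such as melting temperature and self-complementarity, which this sketch does not address.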

34. A xylanase-encoding nucleic acid generated by amplification of a 
polynucleotide using an amplification primer pair as set forth in claim 33. 

35. The xylanase-encoding nucleic acid of claim 34, wherein the
amplification is by polymerase chain reaction (PCR).

36. The xylanase-encoding nucleic acid of claim 34, wherein the nucleic
acid is generated by amplification of a gene library.

37. The xylanase-encoding nucleic acid of claim 34, wherein the gene
library is an environmental library.

38. An isolated or recombinant xylanase encoded by a xylanase-encoding 
nucleic acid as set forth in claim 34. 

39. A method of amplifying a nucleic acid encoding a polypeptide having
a xylanase activity comprising amplification of a template nucleic acid with an amplification
primer sequence pair capable of amplifying a nucleic acid sequence as set forth in claim 1 or
claim 24, or a subsequence thereof.

40. An expression cassette comprising a nucleic acid comprising a
sequence as set forth in claim 1 or claim 24.

41. A vector comprising a nucleic acid comprising a sequence as set forth
in claim 1 or claim 24. 

42. A cloning vehicle comprising a nucleic acid comprising a sequence as 
set forth in claim 1 or claim 24, wherein the cloning vehicle comprises a viral vector, a 
plasmid, a phage, a phagemid, a cosmid, a fosmid, a bacteriophage or an artificial 
chromosome. 

43. The cloning vehicle of claim 42, wherein the viral vector comprises an 
adenovirus vector, a retroviral vector or an adeno-associated viral vector. 

44. The cloning vehicle of claim 42, comprising a bacterial artificial
chromosome (BAC), a plasmid, a bacteriophage P1-derived vector (PAC), a yeast artificial
chromosome (YAC), or a mammalian artificial chromosome (MAC).

45. A transformed cell comprising a nucleic acid comprising a sequence as
set forth in claim 1 or claim 24.

46. A transformed cell comprising an expression cassette as set forth in
claim 40.

47. The transformed cell of claim 40, wherein the cell is a bacterial cell, a
mammalian cell, a fungal cell, a yeast cell, an insect cell or a plant cell.

48. A transgenic non-human animal comprising a sequence as set forth in 
claim 1 or claim 24. 

49. The transgenic non-human animal of claim 48, wherein the animal is a
mouse.



50. A transgenic plant comprising a sequence as set forth in claim 1 or
claim 24.



51. The transgenic plant of claim 50, wherein the plant is a corn plant, a
sorghum plant, a potato plant, a tomato plant, a wheat plant, an oilseed plant, a rapeseed 
plant, a soybean plant, a rice plant, a barley plant, a grass, or a tobacco plant. 

52. A transgenic seed comprising a sequence as set forth in claim 1 or
claim 24.

53. The transgenic seed of claim 52, wherein the seed is a corn seed, a
wheat kernel, an oilseed, a rapeseed, a soybean seed, a palm kernel, a sunflower seed, a
sesame seed, a rice, a barley, a peanut or a tobacco plant seed.

54. An antisense oligonucleotide comprising a nucleic acid sequence
complementary to or capable of hybridizing under stringent conditions to a sequence as set
forth in claim 1 or claim 24, or a subsequence thereof.

55. The antisense oligonucleotide of claim 49, wherein the antisense
oligonucleotide is between about 10 to 50, about 20 to 60, about 30 to 70, about 40 to 80, or
about 60 to 100 bases in length.

56. A method of inhibiting the translation of a xylanase message in a cell
comprising administering to the cell or expressing in the cell an antisense oligonucleotide
comprising a nucleic acid sequence complementary to or capable of hybridizing under
stringent conditions to a sequence as set forth in claim 1 or claim 24.

57. A double-stranded inhibitory RNA (RNAi) molecule comprising a 
subsequence of a sequence as set forth in claim 1 or claim 24. 

58. The double-stranded inhibitory RNA (RNAi) molecule of claim 52,
wherein the RNAi is about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 or more duplex
nucleotides in length.

59. A method of inhibiting the expression of a xylanase in a cell
comprising administering to the cell or expressing in the cell a double-stranded inhibitory
RNA (iRNA), wherein the RNA comprises a subsequence of a sequence as set forth in claim
1 or claim 24.

60. An isolated or recombinant polypeptide (i) having at least 50%
sequence identity to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID
NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID
NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42,
SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID
NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64,
SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72, SEQ ID NO:74, SEQ ID
NO:76, SEQ ID NO:78, SEQ ID NO:80, SEQ ID NO:82, SEQ ID NO:84, SEQ ID NO:86,
SEQ ID NO:88, SEQ ID NO:90, SEQ ID NO:92, SEQ ID NO:94, SEQ ID NO:96, SEQ ID
NO:98, SEQ ID NO:100, SEQ ID NO:102, SEQ ID NO:104, SEQ ID NO:106, SEQ ID
NO:108, SEQ ID NO:110, SEQ ID NO:112, SEQ ID NO:114, SEQ ID NO:116, SEQ ID
NO:118, SEQ ID NO:120, SEQ ID NO:122, SEQ ID NO:124, SEQ ID NO:126, SEQ ID
NO:128, SEQ ID NO:130, SEQ ID NO:132, SEQ ID NO:134, SEQ ID NO:136, SEQ ID
NO:138, SEQ ID NO:140, SEQ ID NO:142, SEQ ID NO:144, SEQ ID NO:146, SEQ ID NO:148,
SEQ ID NO:150, SEQ ID NO:152, SEQ ID NO:154, SEQ ID NO:156, SEQ ID NO:158,
SEQ ID NO:160, SEQ ID NO:162, SEQ ID NO:164, SEQ ID NO:166, SEQ ID NO:168,
SEQ ID NO:170, SEQ ID NO:172, SEQ ID NO:174, SEQ ID NO:176, SEQ ID NO:178,
SEQ ID NO:180, SEQ ID NO:182, SEQ ID NO:184, SEQ ID NO:186, SEQ ID NO:188,
SEQ ID NO:190, SEQ ID NO:192, SEQ ID NO:194, SEQ ID NO:196, SEQ ID NO:198,
SEQ ID NO:200, SEQ ID NO:202, SEQ ID NO:204, SEQ ID NO:206, SEQ ID NO:208,
SEQ ID NO:210, SEQ ID NO:212, SEQ ID NO:214, SEQ ID NO:216, SEQ ID NO:218,
SEQ ID NO:220, SEQ ID NO:222, SEQ ID NO:224, SEQ ID NO:226, SEQ ID NO:228,
SEQ ID NO:230, SEQ ID NO:232, SEQ ID NO:234, SEQ ID NO:236, SEQ ID NO:238,
SEQ ID NO:240, SEQ ID NO:242, SEQ ID NO:244, SEQ ID NO:246, SEQ ID NO:248,
SEQ ID NO:250, SEQ ID NO:252, SEQ ID NO:254, SEQ ID NO:256, SEQ ID NO:258,
SEQ ID NO:260, SEQ ID NO:262, SEQ ID NO:264, SEQ ID NO:266, SEQ ID NO:268,
SEQ ID NO:270, SEQ ID NO:272, SEQ ID NO:274, SEQ ID NO:276, SEQ ID NO:278,
SEQ ID NO:280, SEQ ID NO:282, SEQ ID NO:284, SEQ ID NO:286, SEQ ID NO:288,
SEQ ID NO:290, SEQ ID NO:292, SEQ ID NO:294, SEQ ID NO:296, SEQ ID NO:298,
SEQ ID NO:300, SEQ ID NO:302, SEQ ID NO:304, SEQ ID NO:306, SEQ ID NO:308,
SEQ ID NO:310, SEQ ID NO:312, SEQ ID NO:314, SEQ ID NO:316, SEQ ID NO:318,
SEQ ID NO:320, SEQ ID NO:322, SEQ ID NO:324, SEQ ID NO:326, SEQ ID NO:328,
SEQ ID NO:330, SEQ ID NO:332, SEQ ID NO:334, SEQ ID NO:336, SEQ ID NO:338,
SEQ ID NO:340, SEQ ID NO:342, SEQ ID NO:344, SEQ ID NO:346, SEQ ID NO:348,
SEQ ID NO:350, SEQ ID NO:352, SEQ ID NO:354, SEQ ID NO:356, SEQ ID NO:358,
SEQ ID NO:360, SEQ ID NO:362, SEQ ID NO:364, SEQ ID NO:366, SEQ ID NO:368,
SEQ ID NO:370, SEQ ID NO:372, SEQ ID NO:374, SEQ ID NO:376, SEQ ID NO:378 or
SEQ ID NO:380, over a region of at least about 100 residues, wherein the sequence identities
are determined by analysis with a sequence comparison algorithm or by a visual inspection,
or, (ii) encoded by a nucleic acid having at least 50% sequence identity to a sequence as set
forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21,
SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID
NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43,
SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID
NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65,
SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71, SEQ ID NO:73, SEQ ID NO:75, SEQ ID
NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:87,
SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95, SEQ ID NO:97, SEQ ID
NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107, SEQ ID
NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID
NO:119, SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID
NO:129, SEQ ID NO:131, SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID
NO:139, SEQ ID NO:141, SEQ ID NO:143, SEQ ID NO:145, SEQ ID NO:147, SEQ ID
NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155, SEQ ID NO:157, SEQ ID
NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID NO:167, SEQ ID
NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID
NO:179, SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID
NO:189, SEQ ID NO:191, SEQ ID NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID
NO:199, SEQ ID NO:201, SEQ ID NO:203, SEQ ID NO:205, SEQ ID NO:207, SEQ ID
NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215, SEQ ID NO:217, SEQ ID
NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID NO:227, SEQ ID
NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID
NO:239, SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID
NO:249, SEQ ID NO:251, SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID
NO:259, SEQ ID NO:261, SEQ ID NO:263, SEQ ID NO:265, SEQ ID NO:267, SEQ ID
NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275, SEQ ID NO:277, SEQ ID
NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID NO:287, SEQ ID
NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID
NO:299, SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID
NO:309, SEQ ID NO:311, SEQ ID NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID
NO:319, SEQ ID NO:321, SEQ ID NO:323, SEQ ID NO:325, SEQ ID NO:327, SEQ ID
NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335, SEQ ID NO:337, SEQ ID
NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID NO:347, SEQ ID
NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID
NO:359, SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID
NO:369, SEQ ID NO:371, SEQ ID NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID
NO:379, over a region of at least about 100 residues, and the sequence identities are
determined by analysis with a sequence comparison algorithm or by a visual inspection, or
encoded by a nucleic acid capable of hybridizing under stringent conditions to a sequence as
set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ
ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21,
SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID
NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43,
SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID
NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65,
SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71, SEQ ID NO:73, SEQ ID NO:75, SEQ ID
NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:87,
SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95, SEQ ID NO:97, SEQ ID
NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107, SEQ ID
NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID
NO:119, SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID
NO:129, SEQ ID NO:131, SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID
NO:139, SEQ ID NO:141, SEQ ID NO:143, SEQ ID NO:145, SEQ ID NO:147, SEQ ID
NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155, SEQ ID NO:157, SEQ ID
NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID NO:167, SEQ ID
NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID
NO:179, SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID
NO:189, SEQ ID NO:191, SEQ ID NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID
NO:199, SEQ ID NO:201, SEQ ID NO:203, SEQ ID NO:205, SEQ ID NO:207, SEQ ID
NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215, SEQ ID NO:217, SEQ ID
NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID NO:227, SEQ ID
NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID
NO:239, SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID
NO:249, SEQ ID NO:251, SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID
NO:259, SEQ ID NO:261, SEQ ID NO:263, SEQ ID NO:265, SEQ ID NO:267, SEQ ID
NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275, SEQ ID NO:277, SEQ ID
NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID NO:287, SEQ ID
NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID
NO:299, SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID
NO:309, SEQ ID NO:311, SEQ ID NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID
NO:319, SEQ ID NO:321, SEQ ID NO:323, SEQ ID NO:325, SEQ ID NO:327, SEQ ID
NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335, SEQ ID NO:337, SEQ ID
NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID NO:347, SEQ ID
NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID
NO:359, SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID
NO:369, SEQ ID NO:371, SEQ ID NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID
NO:379.



61. The isolated or recombinant polypeptide of claim 60, wherein the
sequence identity is over a region of at least about 51%, 52%, 53%, 54%, 55%,
56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%,
72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%,
88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more, or is 100%
sequence identity.

62. The isolated or recombinant polypeptide of claim 60, wherein the 
sequence identity is over a region of at least about 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 
150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 
1050 or more residues, or the full length of an enzyme. 

63. The isolated or recombinant polypeptide of claim 60, wherein the
polypeptide has a sequence as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ
ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18,
SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID
NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40,
SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID
NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62,
SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72, SEQ ID
NO:74, SEQ ID NO:76, SEQ ID NO:78, SEQ ID NO:80, SEQ ID NO:82, SEQ ID NO:84,
SEQ ID NO:86, SEQ ID NO:88, SEQ ID NO:90, SEQ ID NO:92, SEQ ID NO:94, SEQ ID
NO:96, SEQ ID NO:98, SEQ ID NO:100, SEQ ID NO:102, SEQ ID NO:104, SEQ ID
NO:106, SEQ ID NO:108, SEQ ID NO:110, SEQ ID NO:112, SEQ ID NO:114, SEQ ID
NO:116, SEQ ID NO:118, SEQ ID NO:120, SEQ ID NO:122, SEQ ID NO:124, SEQ ID
NO:126, SEQ ID NO:128, SEQ ID NO:130, SEQ ID NO:132, SEQ ID NO:134, SEQ ID
NO:136, SEQ ID NO:138, SEQ ID NO:140, SEQ ID NO:142, SEQ ID NO:144, SEQ ID NO:146,
SEQ ID NO:148, SEQ ID NO:150, SEQ ID NO:152, SEQ ID NO:154, SEQ ID NO:156,
SEQ ID NO:158, SEQ ID NO:160, SEQ ID NO:162, SEQ ID NO:164, SEQ ID NO:166,
SEQ ID NO:168, SEQ ID NO:170, SEQ ID NO:172, SEQ ID NO:174, SEQ ID NO:176,
SEQ ID NO:178, SEQ ID NO:180, SEQ ID NO:182, SEQ ID NO:184, SEQ ID NO:186,
SEQ ID NO:188, SEQ ID NO:190, SEQ ID NO:192, SEQ ID NO:194, SEQ ID NO:196,
SEQ ID NO:198, SEQ ID NO:200, SEQ ID NO:202, SEQ ID NO:204, SEQ ID NO:206,
SEQ ID NO:208, SEQ ID NO:210, SEQ ID NO:212, SEQ ID NO:214, SEQ ID NO:216,
SEQ ID NO:218, SEQ ID NO:220, SEQ ID NO:222, SEQ ID NO:224, SEQ ID NO:226,
SEQ ID NO:228, SEQ ID NO:230, SEQ ID NO:232, SEQ ID NO:234, SEQ ID NO:236,
SEQ ID NO:238, SEQ ID NO:240, SEQ ID NO:242, SEQ ID NO:244, SEQ ID NO:246,
SEQ ID NO:248, SEQ ID NO:250, SEQ ID NO:252, SEQ ID NO:254, SEQ ID NO:256,
SEQ ID NO:258, SEQ ID NO:260, SEQ ID NO:262, SEQ ID NO:264, SEQ ID NO:266,
SEQ ID NO:268, SEQ ID NO:270, SEQ ID NO:272, SEQ ID NO:274, SEQ ID NO:276,
SEQ ID NO:278, SEQ ID NO:280, SEQ ID NO:282, SEQ ID NO:284, SEQ ID NO:286,
SEQ ID NO:288, SEQ ID NO:290, SEQ ID NO:292, SEQ ID NO:294, SEQ ID NO:296,
SEQ ID NO:298, SEQ ID NO:300, SEQ ID NO:302, SEQ ID NO:304, SEQ ID NO:306,
SEQ ID NO:308, SEQ ID NO:310, SEQ ID NO:312, SEQ ID NO:314, SEQ ID NO:316,
SEQ ID NO:318, SEQ ID NO:320, SEQ ID NO:322, SEQ ID NO:324, SEQ ID NO:326,
SEQ ID NO:328, SEQ ID NO:330, SEQ ID NO:332, SEQ ID NO:334, SEQ ID NO:336,
SEQ ID NO:338, SEQ ID NO:340, SEQ ID NO:342, SEQ ID NO:344, SEQ ID NO:346,
SEQ ID NO:348, SEQ ID NO:350, SEQ ID NO:352, SEQ ID NO:354, SEQ ID NO:356,
SEQ ID NO:358, SEQ ID NO:360, SEQ ID NO:362, SEQ ID NO:364, SEQ ID NO:366,
SEQ ID NO:368, SEQ ID NO:370, SEQ ID NO:372, SEQ ID NO:374, SEQ ID NO:376,
SEQ ID NO:378 or SEQ ID NO:380.

64. The isolated or recombinant polypeptide of claim 60, wherein the 
polypeptide has a xylanase activity. 

65. The isolated or recombinant polypeptide of claim 64, wherein the
xylanase activity comprises catalyzing hydrolysis of internal β-1,4-xylosidic linkages.

66. The isolated or recombinant polypeptide of claim 65, wherein the 
xylanase activity comprises an endo-1,4-beta-xylanase activity.

67. The isolated or recombinant polypeptide of claim 64, wherein the 
xylanase activity comprises hydrolyzing a xylan to produce a smaller molecular weight 
xylose and xylo-oligomer. 


68. The isolated or recombinant polypeptide of claim 67, wherein the 
xylan comprises an arabinoxylan. 



69. The isolated or recombinant polypeptide of claim 68, wherein the
arabinoxylan comprises a water soluble arabinoxylan.

70. The isolated or recombinant polypeptide of claim 69, wherein the
water soluble arabinoxylan comprises a dough or a bread product.

71. The isolated or recombinant polypeptide of claim 64, wherein the
xylanase activity comprises hydrolyzing polysaccharides comprising 1,4-β-glycoside-linked
D-xylopyranoses.

72. The isolated or recombinant polypeptide of claim 64, wherein the
xylanase activity comprises hydrolyzing hemicelluloses.

73. The isolated or recombinant polypeptide of claim 72, wherein the
xylanase activity comprises hydrolyzing hemicelluloses in a wood or paper pulp or a paper
product.

74. The isolated or recombinant polypeptide of claim 73, wherein the 
xylanase activity comprises catalyzing hydrolysis of xylans in a feed or a food product. 

75. The isolated or recombinant polypeptide of claim 74, wherein the feed
or food product comprises a cereal-based animal feed, a wort or a beer, a milk or a milk
product, a fruit or a vegetable.

76. The isolated or recombinant polypeptide of claim 64, wherein the 
xylanase activity comprises catalyzing hydrolysis of xylans in a microbial cell or a plant cell. 

77. The isolated or recombinant polypeptide of claim 64, wherein the
xylanase activity is thermostable.

78. The isolated or recombinant polypeptide of claim 77, wherein the
polypeptide retains a xylanase activity under conditions comprising a temperature range of
between about 1°C to about 5°C, between about 5°C to about 15°C, between about 15°C to
about 25°C, between about 25°C to about 37°C, between about 37°C to about 95°C, between
about 55°C to about 85°C, between about 70°C to about 95°C, between about 70°C to about
75°C, or between about 90°C to about 95°C.

79. The isolated or recombinant polypeptide of claim 64, wherein the 
xylanase activity is thermotolerant. 



80. The isolated or recombinant polypeptide of claim 79, wherein the
polypeptide retains a xylanase activity after exposure to a temperature in the range from
between about 1°C to about 5°C, between about 5°C to about 15°C, between about 15°C to
about 25°C, between about 25°C to about 37°C, between about 37°C to about 95°C, between
about 55°C to about 85°C, between about 70°C to about 75°C, or between about 90°C to about
95°C, or more.

81. An isolated or recombinant polypeptide comprising a polypeptide as 
set forth in claim 60 and lacking a signal sequence or a prepro sequence. 

82. An isolated or recombinant polypeptide comprising a polypeptide as 
set forth in claim 60 and having a heterologous signal sequence or a heterologous prepro 
sequence. 

83. The isolated or recombinant polypeptide of claim 64, wherein the
xylanase activity comprises a specific activity at about 37°C in the range from about 100 to
about 1000 units per milligram of protein, from about 500 to about 750 units per milligram of 
protein, from about 500 to about 1200 units per milligram of protein, or from about 750 to 
about 1000 units per milligram of protein. 



84. The isolated or recombinant polypeptide of claim 79, wherein the
thermotolerance comprises retention of at least half of the specific activity of the xylanase at
37°C after being heated to an elevated temperature.

85. The isolated or recombinant polypeptide of claim 79, wherein the
thermotolerance comprises retention of specific activity at 37°C in the range from about 500
to about 1200 units per milligram of protein after being heated to an elevated temperature.
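
By way of illustration only, the retention criterion recited in claims 84 and 85 can be expressed as a simple calculation. The sketch below assumes that specific activity is reported in units per milligram of protein and that measurements are taken at 37°C before and after heating; the function names and the 0.5 default threshold (taken from the "at least half" language of claim 84) are hypothetical and form no part of the claims.

```python
def specific_activity(units: float, protein_mg: float) -> float:
    """Specific activity in units per milligram of protein."""
    if protein_mg <= 0:
        raise ValueError("protein mass must be positive")
    return units / protein_mg


def retained_fraction(before: float, after: float) -> float:
    """Fraction of the pre-heating specific activity that remains after heating."""
    if before <= 0:
        raise ValueError("pre-heating specific activity must be positive")
    return after / before


def meets_half_retention(before: float, after: float, threshold: float = 0.5) -> bool:
    """True when at least `threshold` of the specific activity is retained.

    The default of 0.5 mirrors the 'at least half' wording; it is an
    illustrative assumption, not a measured value.
    """
    return retained_fraction(before, after) >= threshold


def within_range(value: float, low: float = 500.0, high: float = 1200.0) -> bool:
    """True when a specific activity falls in an inclusive range (units/mg)."""
    return low <= value <= high
```

For example, an enzyme measured at 800 units/mg before heating and 420 units/mg afterwards retains 52.5% of its specific activity and would satisfy the half-retention test, although 420 units/mg falls outside the 500 to 1200 units/mg range.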

86. The isolated or recombinant polypeptide of claim 60, wherein the
polypeptide comprises at least one glycosylation site.

87. The isolated or recombinant polypeptide of claim 86, wherein the 
glycosylation is an N-linked glycosylation. 

88. The isolated or recombinant polypeptide of claim 87, wherein the 
polypeptide is glycosylated after being expressed in a P. pastoris or a S. pombe. 

89. The isolated or recombinant polypeptide of claim 64, wherein the
polypeptide retains a xylanase activity under conditions comprising about pH 6.5, pH 6.0, pH
5.5, pH 5.0, pH 4.5 or pH 4.0.

90. The isolated or recombinant polypeptide of claim 64, wherein the
polypeptide retains a xylanase activity under conditions comprising about pH 7.5, pH 8.0, pH
8.5, pH 9, pH 9.5, pH 10 or pH 10.5.

91. A protein preparation comprising a polypeptide as set forth in claim 
60, wherein the protein preparation comprises a liquid, a solid or a gel. 

92. A heterodimer comprising a polypeptide as set forth in claim 60 and a
second domain.

93. The heterodimer of claim 92, wherein the second domain is a 
polypeptide and the heterodimer is a fusion protein. 

94. The heterodimer of claim 92, wherein the second domain is an epitope
or a tag.

95. A homodimer comprising a polypeptide as set forth in claim 60.

96. An immobilized polypeptide, wherein the polypeptide comprises a
sequence as set forth in claim 60, or a subsequence thereof.

97. The immobilized polypeptide of claim 96, wherein the polypeptide is
immobilized on a cell, a metal, a resin, a polymer, a ceramic, a glass, a microelectrode, a
graphitic particle, a bead, a gel, a plate, an array or a capillary tube.

98. An array comprising an immobilized polypeptide as set forth in claim
60.



99. An array comprising an immobilized nucleic acid as set forth in claim
1 or claim 24.



100. An isolated or recombinant antibody that specifically binds to a
polypeptide as set forth in claim 60.

101. The isolated or recombinant antibody of claim 100, wherein the
antibody is a monoclonal or a polyclonal antibody.

102. A hybridoma comprising an antibody that specifically binds to a 
polypeptide as set forth in claim 60. 



103. A method of isolating or identifying a polypeptide with a xylanase
activity comprising the steps of:
(a) providing an antibody as set forth in claim 100;
(b) providing a sample comprising polypeptides; and
(c) contacting the sample of step (b) with the antibody of step (a) under
conditions wherein the antibody can specifically bind to the polypeptide, thereby isolating or
identifying a polypeptide having a xylanase activity.

104. A method of making an anti-xylanase antibody comprising
administering to a non-human animal a nucleic acid as set forth in claim 1 or claim 24 or a
subsequence thereof in an amount sufficient to generate a humoral immune response, thereby
making an anti-xylanase antibody.

105. A method of making an anti-xylanase antibody comprising
administering to a non-human animal a polypeptide as set forth in claim 60 or a subsequence
thereof in an amount sufficient to generate a humoral immune response, thereby making an
anti-xylanase antibody.

106. A method of producing a recombinant polypeptide comprising the
steps of: (a) providing a nucleic acid operably linked to a promoter, wherein the nucleic acid
comprises a sequence as set forth in claim 1 or claim 24; and (b) expressing the nucleic acid
of step (a) under conditions that allow expression of the polypeptide, thereby producing a
recombinant polypeptide.



107. The method of claim 106, further comprising transforming a host cell
with the nucleic acid of step (a) followed by expressing the nucleic acid of step (a), thereby
producing a recombinant polypeptide in a transformed cell.



108. A method for identifying a polypeptide having a xylanase activity
comprising the following steps:
(a) providing a polypeptide as set forth in claim 64;
(b) providing a xylanase substrate; and
(c) contacting the polypeptide with the substrate of step (b) and detecting a
decrease in the amount of substrate or an increase in the amount of a reaction product,
wherein a decrease in the amount of the substrate or an increase in the amount of the reaction
product detects a polypeptide having a xylanase activity.

109. A method for identifying a xylanase substrate comprising the
following steps:
(a) providing a polypeptide as set forth in claim 64;
(b) providing a test substrate; and
(c) contacting the polypeptide of step (a) with the test substrate of step (b) and
detecting a decrease in the amount of substrate or an increase in the amount of reaction
product, wherein a decrease in the amount of the substrate or an increase in the amount of a
reaction product identifies the test substrate as a xylanase substrate.



110. A method of determining whether a test compound specifically binds
to a polypeptide comprising the following steps:
(a) expressing a nucleic acid or a vector comprising the nucleic acid under
conditions permissive for translation of the nucleic acid to a polypeptide, wherein the nucleic
acid has a sequence as set forth in claim 1 or claim 24;
(b) providing a test compound;
(c) contacting the polypeptide with the test compound; and
(d) determining whether the test compound of step (b) specifically binds to the
polypeptide.

111. A method of determining whether a test compound specifically binds
to a polypeptide comprising the following steps:
(a) providing a polypeptide as set forth in claim 60;
(b) providing a test compound;
(c) contacting the polypeptide with the test compound; and
(d) determining whether the test compound of step (b) specifically binds to the
polypeptide.

112. A method for identifying a modulator of a xylanase activity comprising
the following steps:
(a) providing a polypeptide as set forth in claim 64;
(b) providing a test compound;
(c) contacting the polypeptide of step (a) with the test compound of step (b)
and measuring an activity of the xylanase, wherein a change in the xylanase activity
measured in the presence of the test compound compared to the activity in the absence of the
test compound provides a determination that the test compound modulates the xylanase
activity.

113. The method of claim 112, wherein the xylanase activity is measured by
providing a xylanase substrate and detecting a decrease in the amount of the substrate or an
increase in the amount of a reaction product, or, an increase in the amount of the substrate or
a decrease in the amount of a reaction product.



114. The method of claim 113, wherein a decrease in the amount of the
substrate or an increase in the amount of the reaction product with the test compound as
compared to the amount of substrate or reaction product without the test compound identifies
the test compound as an activator of a xylanase activity.

115. The method of claim 113, wherein an increase in the amount of the
substrate or a decrease in the amount of the reaction product with the test compound as
compared to the amount of substrate or reaction product without the test compound identifies
the test compound as an inhibitor of a xylanase activity.
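
The comparison recited in claims 113 to 115 might be sketched as follows, purely as an illustration. The sketch assumes that the amount of reaction product is measured with and without the test compound under otherwise identical conditions; the function name and the 5% tolerance used to absorb assay noise are hypothetical and are not recited in the claims.

```python
def classify_test_compound(
    product_without: float,
    product_with: float,
    tolerance: float = 0.05,
) -> str:
    """Classify a test compound from reaction-product amounts.

    More product with the compound suggests an activator; less product
    suggests an inhibitor. `tolerance` is an assumed relative noise band
    inside which no effect is called.
    """
    if product_without <= 0:
        raise ValueError("control product amount must be positive")
    relative_change = (product_with - product_without) / product_without
    if relative_change > tolerance:
        return "activator"
    if relative_change < -tolerance:
        return "inhibitor"
    return "no detectable effect"
```

The same logic applies with the signs reversed when the amount of remaining substrate is measured instead of the amount of product.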

116. A computer system comprising a processor and a data storage device
wherein said data storage device has stored thereon a polypeptide sequence or a nucleic acid
sequence, wherein the polypeptide sequence comprises a sequence as set forth in claim 60, or a
polypeptide encoded by a nucleic acid as set forth in claim 1 or claim 24.

117. The computer system of claim 115, further comprising a sequence
comparison algorithm and a data storage device having at least one reference sequence stored
thereon.

118. The computer system of claim 117, wherein the sequence comparison 
algorithm comprises a computer program that indicates polymorphisms. 

119. The computer system of claim 117, further comprising an identifier 
that identifies one or more features in said sequence. 

120. A computer readable medium having stored thereon a polypeptide
sequence or a nucleic acid sequence, wherein the polypeptide sequence comprises a
polypeptide as set forth in claim 60; a polypeptide encoded by a nucleic acid as set forth in
claim 1 or claim 24.

121. A method for identifying a feature in a sequence comprising the steps
of: (a) reading the sequence using a computer program which identifies one or more features
in a sequence, wherein the sequence comprises a polypeptide sequence or a nucleic acid
sequence, wherein the polypeptide sequence comprises a polypeptide as set forth in claim 60;
a polypeptide encoded by a nucleic acid as set forth in claim 1 or claim 24; and (b)
identifying one or more features in the sequence with the computer program.

122. A method for comparing a first sequence to a second sequence
comprising the steps of: (a) reading the first sequence and the second sequence through use
of a computer program which compares sequences, wherein the first sequence comprises a
polypeptide sequence or a nucleic acid sequence, wherein the polypeptide sequence
comprises a polypeptide as set forth in claim 60 or a polypeptide encoded by a nucleic acid as
set forth in claim 1 or claim 24; and (b) determining differences between the first sequence
and the second sequence with the computer program.

123. The method of claim 122, wherein the step of determining differences 
between the first sequence and the second sequence further comprises the step of identifying 
polymorphisms. 
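
As a non-limiting illustration of the comparison step of claims 122 and 123, the following sketch reports the positions at which two sequences differ. It assumes the two sequences have already been aligned to equal length; the function name and the example sequences are hypothetical, and a practical implementation would typically use an alignment algorithm first.

```python
from typing import List, Tuple


def find_differences(first: str, second: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, residue in first, residue in second) for each mismatch.

    Assumes pre-aligned sequences of equal length; gaps, if any, are
    treated as ordinary characters.
    """
    if len(first) != len(second):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index + 1, a, b)
        for index, (a, b) in enumerate(zip(first.upper(), second.upper()))
        if a != b
    ]
```

Each reported position is a candidate single-residue polymorphism between the first sequence and the reference.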

124. The method of claim 123, further comprising an identifier that
identifies one or more features in a sequence.

125. The method of claim 124, comprising reading the first sequence using 
a computer program and identifying one or more features in the sequence. 

126. A method for isolating or recovering a nucleic acid encoding a
polypeptide with a xylanase activity from an environmental sample comprising the steps of:
(a) providing an amplification primer sequence pair as set forth in claim 31 or
claim 33;
(b) isolating a nucleic acid from the environmental sample or treating the
environmental sample such that nucleic acid in the sample is accessible for hybridization to
the amplification primer pair; and,
(c) combining the nucleic acid of step (b) with the amplification primer pair of
step (a) and amplifying nucleic acid from the environmental sample, thereby isolating or
recovering a nucleic acid encoding a polypeptide with a xylanase activity from an
environmental sample.



127. The method of claim 126, wherein each member of the amplification
primer sequence pair comprises an oligonucleotide comprising at least about 10 to 50
consecutive bases of a sequence as set forth in
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35,
SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71,
SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83,
SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95,
SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107,
SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119,
SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID NO:129, SEQ ID NO:131,
SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143,
SEQ ID NO:145, SEQ ID NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155,
SEQ ID NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID NO:167,
SEQ ID NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID NO:179,
SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID NO:189, SEQ ID NO:191,
SEQ ID NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID NO:203,
SEQ ID NO:205, SEQ ID NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215,
SEQ ID NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID NO:227,
SEQ ID NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID NO:239,
SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID NO:249, SEQ ID NO:251,
SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID NO:263,
SEQ ID NO:265, SEQ ID NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275,
SEQ ID NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID NO:287,
SEQ ID NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID NO:299,
SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID NO:309, SEQ ID NO:311,
SEQ ID NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID NO:323,
SEQ ID NO:325, SEQ ID NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335,
SEQ ID NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID NO:347,
SEQ ID NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID NO:359,
SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID NO:369, SEQ ID NO:371,
SEQ ID NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID NO:379, or a subsequence thereof.



128. A method for isolating or recovering a nucleic acid encoding a
polypeptide with a xylanase activity from an environmental sample comprising the steps of:
(a) providing a polynucleotide probe comprising a sequence as set forth in
claim 1 or claim 24, or a subsequence thereof;
(b) isolating a nucleic acid from the environmental sample or treating the
environmental sample such that nucleic acid in the sample is accessible for hybridization to a
polynucleotide probe of step (a);
(c) combining the isolated nucleic acid or the treated environmental sample of
step (b) with the polynucleotide probe of step (a); and
(d) isolating a nucleic acid that specifically hybridizes with the polynucleotide
probe of step (a), thereby isolating or recovering a nucleic acid encoding a polypeptide with a
xylanase activity from an environmental sample.

129. The method of claim 127 or claim 128, wherein the environmental
sample comprises a water sample, a liquid sample, a soil sample, an air sample or a biological
sample.

130. The method of claim 129, wherein the biological sample is derived
from a bacterial cell, a protozoan cell, an insect cell, a yeast cell, a plant cell, a fungal cell or
a mammalian cell.

131. A method of generating a variant of a nucleic acid encoding a
polypeptide with a xylanase activity comprising the steps of:
(a) providing a template nucleic acid comprising a sequence as set forth in
claim 1 or claim 24; and
(b) modifying, deleting or adding one or more nucleotides in the template
sequence, or a combination thereof, to generate a variant of the template nucleic acid.

132. The method of claim 131, further comprising expressing the variant 
nucleic acid to generate a variant xylanase polypeptide. 

133. The method of claim 131, wherein the modifications, additions or
deletions are introduced by a method comprising error-prone PCR, shuffling,
oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo
mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential ensemble
mutagenesis, site-specific mutagenesis, gene reassembly, gene site saturated mutagenesis
(GSSM™), synthetic ligation reassembly (SLR) and a combination thereof.

134. The method of claim 131, wherein the modifications, additions or
deletions are introduced by a method comprising recombination, recursive sequence
recombination, phosphothioate-modified DNA mutagenesis, uracil-containing template
mutagenesis, gapped duplex mutagenesis, point mismatch repair mutagenesis,
repair-deficient host strain mutagenesis, chemical mutagenesis, radiogenic mutagenesis, deletion
mutagenesis, restriction-selection mutagenesis, restriction-purification mutagenesis, artificial
gene synthesis, ensemble mutagenesis, chimeric nucleic acid multimer creation and a
combination thereof.

135. The method of claim 131, wherein the method is iteratively repeated
until a xylanase having an altered or different activity or an altered or different stability from
that of a polypeptide encoded by the template nucleic acid is produced.

136. The method of claim 135, wherein the variant xylanase polypeptide is
thermotolerant, and retains some activity after being exposed to an elevated temperature.

137. The method of claim 135, wherein the variant xylanase polypeptide
has increased glycosylation as compared to the xylanase encoded by a template nucleic acid.

138. The method of claim 135, wherein the variant xylanase polypeptide
has a xylanase activity under a high temperature, wherein the xylanase encoded by the
template nucleic acid is not active under the high temperature.

139. The method of claim 131, wherein the method is iteratively repeated
until a xylanase coding sequence having an altered codon usage from that of the template
nucleic acid is produced.

140. The method of claim 131, wherein the method is iteratively repeated
until a xylanase gene having higher or lower level of message expression or stability from
that of the template nucleic acid is produced.

141. A method for modifying codons in a nucleic acid encoding a
polypeptide with a xylanase activity to increase its expression in a host cell, the method
comprising the following steps:
(a) providing a nucleic acid encoding a polypeptide with a xylanase activity
comprising a sequence as set forth in claim 1 or claim 24; and,
(b) identifying a non-preferred or a less preferred codon in the nucleic acid of
step (a) and replacing it with a preferred or neutrally used codon encoding the same amino
acid as the replaced codon, wherein a preferred codon is a codon over-represented in coding
sequences in genes in the host cell and a non-preferred or less preferred codon is a codon
under-represented in coding sequences in genes in the host cell, thereby modifying the
nucleic acid to increase its expression in a host cell.
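
The codon replacement of step (b) can be illustrated, without limitation, by the sketch below. It assumes a lookup table of preferred codons for a host cell; the table entries, the partial codon-to-amino-acid map and the function names are hypothetical placeholders chosen for the example and do not represent measured codon usage for any organism.

```python
# Partial codon table, sufficient for the example only (standard genetic code).
CODON_TO_AMINO_ACID = {
    "ATG": "M",
    "CTA": "L", "CTG": "L", "TTA": "L",
    "AGG": "R", "CGT": "R",
    "AAA": "K", "AAG": "K",
}

# Hypothetical preferred codon per amino acid for an illustrative host cell.
PREFERRED_CODON = {"L": "CTG", "R": "CGT", "K": "AAA"}


def split_codons(sequence: str) -> list:
    """Split a coding sequence into consecutive three-base codons."""
    if len(sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return [sequence[i:i + 3].upper() for i in range(0, len(sequence), 3)]


def translate(sequence: str) -> str:
    """Translate using the partial table; unknown codons become '?'."""
    return "".join(CODON_TO_AMINO_ACID.get(c, "?") for c in split_codons(sequence))


def replace_non_preferred_codons(sequence: str) -> str:
    """Swap each codon for the host's preferred synonymous codon, if one is listed.

    Codons absent from either table are left unchanged, so the encoded
    amino acid sequence is preserved.
    """
    optimized = []
    for codon in split_codons(sequence):
        amino_acid = CODON_TO_AMINO_ACID.get(codon)
        optimized.append(PREFERRED_CODON.get(amino_acid, codon))
    return "".join(optimized)
```

Because every substitution is synonymous, translating the sequence before and after replacement gives the same polypeptide; only the codon usage changes.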

142. A method for modifying codons in a nucleic acid encoding a xylanase
polypeptide, the method comprising the following steps:
(a) providing a nucleic acid encoding a polypeptide with a xylanase activity
comprising a sequence as set forth in claim 1 or claim 24; and,
(b) identifying a codon in the nucleic acid of step (a) and replacing it with a
different codon encoding the same amino acid as the replaced codon, thereby modifying
codons in a nucleic acid encoding a xylanase.

143. A method for modifying codons in a nucleic acid encoding a xylanase
polypeptide to increase its expression in a host cell, the method comprising the following
steps:
(a) providing a nucleic acid encoding a xylanase polypeptide comprising a
sequence as set forth in claim 1 or claim 24; and,
(b) identifying a non-preferred or a less preferred codon in the nucleic acid of
step (a) and replacing it with a preferred or neutrally used codon encoding the same amino
acid as the replaced codon, wherein a preferred codon is a codon over-represented in coding
sequences in genes in the host cell and a non-preferred or less preferred codon is a codon
under-represented in coding sequences in genes in the host cell, thereby modifying the
nucleic acid to increase its expression in a host cell.

144. A method for modifying a codon in a nucleic acid encoding a
polypeptide having a xylanase activity to decrease its expression in a host cell, the method
comprising the following steps:
(a) providing a nucleic acid encoding a xylanase polypeptide comprising a
sequence as set forth in claim 1 or claim 24; and
(b) identifying at least one preferred codon in the nucleic acid of step (a) and
replacing it with a non-preferred or less preferred codon encoding the same amino acid as the
replaced codon, wherein a preferred codon is a codon over-represented in coding sequences
in genes in a host cell and a non-preferred or less preferred codon is a codon
under-represented in coding sequences in genes in the host cell, thereby modifying the nucleic acid
to decrease its expression in a host cell.

145. The method of claim 144, wherein the host cell is a bacterial cell, a
fungal cell, an insect cell, a yeast cell, a plant cell or a mammalian cell.

146. A method for producing a library of nucleic acids encoding a plurality
of modified xylanase active sites or substrate binding sites, wherein the modified active sites
or substrate binding sites are derived from a first nucleic acid comprising a sequence
encoding a first active site or a first substrate binding site, the method comprising the
following steps:
(a) providing a first nucleic acid encoding a first active site or first substrate
binding site, wherein the first nucleic acid sequence comprises a sequence that hybridizes
under stringent conditions to a sequence as set forth in
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35,
SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71,
SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83,
SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:89, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95,
SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107,
SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119,
SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID NO:129, SEQ ID NO:131,
SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143,
SEQ ID NO:145, SEQ ID NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155,
SEQ ID NO:157, SEQ ID NO:159, SEQ ID NO:161, SEQ ID NO:163, SEQ ID NO:165, SEQ ID NO:167,
SEQ ID NO:169, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID NO:179,
SEQ ID NO:181, SEQ ID NO:183, SEQ ID NO:185, SEQ ID NO:187, SEQ ID NO:189, SEQ ID NO:191,
SEQ ID NO:193, SEQ ID NO:195, SEQ ID NO:197, SEQ ID NO:199, SEQ ID NO:201, SEQ ID NO:203,
SEQ ID NO:205, SEQ ID NO:207, SEQ ID NO:209, SEQ ID NO:211, SEQ ID NO:213, SEQ ID NO:215,
SEQ ID NO:217, SEQ ID NO:219, SEQ ID NO:221, SEQ ID NO:223, SEQ ID NO:225, SEQ ID NO:227,
SEQ ID NO:229, SEQ ID NO:231, SEQ ID NO:233, SEQ ID NO:235, SEQ ID NO:237, SEQ ID NO:239,
SEQ ID NO:241, SEQ ID NO:243, SEQ ID NO:245, SEQ ID NO:247, SEQ ID NO:249, SEQ ID NO:251,
SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:261, SEQ ID NO:263,
SEQ ID NO:265, SEQ ID NO:267, SEQ ID NO:269, SEQ ID NO:271, SEQ ID NO:273, SEQ ID NO:275,
SEQ ID NO:277, SEQ ID NO:279, SEQ ID NO:281, SEQ ID NO:283, SEQ ID NO:285, SEQ ID NO:287,
SEQ ID NO:289, SEQ ID NO:291, SEQ ID NO:293, SEQ ID NO:295, SEQ ID NO:297, SEQ ID NO:299,
SEQ ID NO:301, SEQ ID NO:303, SEQ ID NO:305, SEQ ID NO:307, SEQ ID NO:309, SEQ ID NO:311,
SEQ ID NO:313, SEQ ID NO:315, SEQ ID NO:317, SEQ ID NO:319, SEQ ID NO:321, SEQ ID NO:323,
SEQ ID NO:325, SEQ ID NO:327, SEQ ID NO:329, SEQ ID NO:331, SEQ ID NO:333, SEQ ID NO:335,
SEQ ID NO:337, SEQ ID NO:339, SEQ ID NO:341, SEQ ID NO:343, SEQ ID NO:345, SEQ ID NO:347,
SEQ ID NO:349, SEQ ID NO:351, SEQ ID NO:353, SEQ ID NO:355, SEQ ID NO:357, SEQ ID NO:359,
SEQ ID NO:361, SEQ ID NO:363, SEQ ID NO:365, SEQ ID NO:367, SEQ ID NO:369, SEQ ID NO:371,
SEQ ID NO:373, SEQ ID NO:375, SEQ ID NO:377 or SEQ ID NO:379, or a subsequence thereof, and
the nucleic acid encodes a xylanase active site or a xylanase substrate binding site;
(b) providing a set of mutagenic oligonucleotides that encode naturally-
occurring amino acid variants at a plurality of targeted codons in the first nucleic acid; and,
(c) using the set of mutagenic oligonucleotides to generate a set of active site-
encoding or substrate binding site-encoding variant nucleic acids encoding a range of amino
acid variations at each amino acid codon that was mutagenized, thereby producing a library
of nucleic acids encoding a plurality of modified xylanase active sites or substrate binding
sites.
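
Purely for illustration, the library generated in step (c) can be modelled in silico by enumerating every combination of alternative codons at the targeted codon positions. The sketch below assumes a template coding sequence and a mapping from zero-based codon positions to the codons to be tried at each position; the function name, the template and the alternative codons are hypothetical examples.

```python
from itertools import product
from typing import Dict, List


def enumerate_variant_library(
    template: str,
    alternatives: Dict[int, List[str]],
) -> List[str]:
    """Enumerate variant sequences for a set of targeted codon positions.

    `alternatives` maps a zero-based codon index to the codons to place
    there; untargeted codons are kept as in the template.
    """
    if len(template) % 3 != 0:
        raise ValueError("template length must be a multiple of three")
    codons = [template[i:i + 3] for i in range(0, len(template), 3)]
    choices = [
        alternatives.get(index, [codon]) for index, codon in enumerate(codons)
    ]
    # Cartesian product over the per-position choices gives the full library.
    return ["".join(combination) for combination in product(*choices)]
```

The library size is the product of the number of alternatives at each targeted position, which is why saturating many codons at once quickly yields very large libraries.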



147. The method of claim 145, comprising mutagenizing the first nucleic
acid of step (a) by a method comprising an optimized directed evolution system, gene site-
saturation mutagenesis (GSSM™), or a synthetic ligation reassembly (SLR).



148. The method of claim 145, comprising mutagenizing the first nucleic
acid of step (a) or variants by a method comprising error-prone PCR, shuffling,
oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo
mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential ensemble
mutagenesis, site-specific mutagenesis, gene reassembly, gene site saturated mutagenesis
(GSSM™), synthetic ligation reassembly (SLR) and a combination thereof.

149. The method of claim 145, comprising mutagenizing the first nucleic
acid of step (a) or variants by a method comprising recombination, recursive sequence
recombination, phosphothioate-modified DNA mutagenesis, uracil-containing template
mutagenesis, gapped duplex mutagenesis, point mismatch repair mutagenesis,
repair-deficient host strain mutagenesis, chemical mutagenesis, radiogenic mutagenesis, deletion
mutagenesis, restriction-selection mutagenesis, restriction-purification mutagenesis, artificial
gene synthesis, ensemble mutagenesis, chimeric nucleic acid multimer creation and a
combination thereof.



150. A method for making a small molecule comprising the following steps:
(a) providing a plurality of biosynthetic enzymes capable of synthesizing or
modifying a small molecule, wherein one of the enzymes comprises a xylanase enzyme
encoded by a nucleic acid comprising a sequence as set forth in claim 1 or claim 24;
(b) providing a substrate for at least one of the enzymes of step (a); and
(c) reacting the substrate of step (b) with the enzymes under conditions that
facilitate a plurality of biocatalytic reactions to generate a small molecule by a series of
biocatalytic reactions.

151. A method for modifying a small molecule comprising the following
steps:
(a) providing a xylanase enzyme, wherein the enzyme comprises a polypeptide
as set forth in claim 64, or a polypeptide encoded by a nucleic acid comprising a nucleic acid
sequence as set forth in claim 1 or claim 24;
(b) providing a small molecule; and
(c) reacting the enzyme of step (a) with the small molecule of step (b) under
conditions that facilitate an enzymatic reaction catalyzed by the xylanase enzyme, thereby
modifying a small molecule by a xylanase enzymatic reaction.

152. The method of claim 151, comprising a plurality of small molecule . 
20 substrates for the enzyme of step (a), thereby generating a library of modified small 

molecules produced by at least one enzymatic reaction catalyzed by the xylanase enzyme. 

153. The method of claim 151, further comprising a plurality of additional 
enzymes under conditions that facilitate a plurality of biocatalytic reactions by the enzymes 

25 to form a library of modified small molecules produced by the plurality of enzymatic 
reactions. 

1 54. The method .of claim 1 53, further comprising the step of testing the 
library to determine if a particular modified small molecule which exhibits a desired activity 

30 is present within the library. 

155. The method of claim 154, wherein the step of testing the library further 
comprises the steps of systematically eliminating all but one of the biocatalytic reactions used 
to produce a portion of the plurality of the modified small molecules within the library by 
testing the portion of the modified small molecule for the presence or absence of the 
particular modified small molecule with a desired activity, and identifying at least one 
specific biocatalytic reaction that produces the particular modified small molecule of desired 
activity. 

156. A method for determining a functional fragment of a xylanase enzyme 
comprising the steps of: 

(a) providing a xylanase enzyme, wherein the enzyme comprises a polypeptide 
as set forth in claim 64, or a polypeptide encoded by a nucleic acid as set forth in claim 1 or 
claim 24; and 

(b) deleting a plurality of amino acid residues from the sequence of step (a) 
and testing the remaining subsequence for a xylanase activity, thereby determining a 
functional fragment of a xylanase enzyme. 

157. The method of claim 156, wherein the xylanase activity is measured by 
providing a xylanase substrate and detecting a decrease in the amount of the substrate or an 
increase in the amount of a reaction product. 

158. A method for whole cell engineering of new or modified phenotypes 
by using real-time metabolic flux analysis, the method comprising the following steps: 

(a) making a modified cell by modifying the genetic composition of a cell, 
wherein the genetic composition is modified by addition to the cell of a nucleic acid 
comprising a sequence as set forth in claim 1 or claim 24; 

(b) culturing the modified cell to generate a plurality of modified cells; 

(c) measuring at least one metabolic parameter of the cell by monitoring the 
cell culture of step (b) in real time; and, 

(d) analyzing the data of step (c) to determine if the measured parameter 
differs from a comparable measurement in an unmodified cell under similar conditions, 
thereby identifying an engineered phenotype in the cell using real-time metabolic flux 
analysis. 

159. The method of claim 158, wherein the genetic composition of the cell 
is modified by a method comprising deletion of a sequence or modification of a sequence in 
the cell, or, knocking out the expression of a gene. 

160. The method of claim 158, further comprising selecting a cell 
comprising a newly engineered phenotype. 

161. The method of claim 160, further comprising culturing the selected 
cell, thereby generating a new cell strain comprising a newly engineered phenotype. 

162. An isolated or recombinant signal sequence consisting of a sequence 
as set forth in residues 1 to 14, 1 to 15, 1 to 16, 1 to 17, 1 to 18, 1 to 19, 1 to 20, 1 to 21, 1 to 
22, 1 to 23, 1 to 24, 1 to 25, 1 to 26, 1 to 27, 1 to 28, 1 to 28, 1 to 30, 1 to 31, 1 to 32, 1 to 33, 
1 to 34, 1 to 35, 1 to 36, 1 to 37, 1 to 38, 1 to 40, 1 to 41, 1 to 42, 1 to 43 or 1 to 44, of SEQ 
ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, 
SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID 
NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, 
SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID 
NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, 
SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID 
NO:68, SEQ ID NO:70, SEQ ID NO:72, SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:78, 
SEQ ID NO:80, SEQ ID NO:82, SEQ ID NO:84, SEQ ID NO:86, SEQ ID NO:88, SEQ ID 
NO:90, SEQ ID NO:92, SEQ ID NO:94, SEQ ID NO:96, SEQ ID NO:98, SEQ ID NO:100, 
SEQ ID NO:102, SEQ ID NO:104, SEQ ID NO:106, SEQ ID NO:108, SEQ ID NO:110, 
SEQ ID NO:112, SEQ ID NO:114, SEQ ID NO:116, SEQ ID NO:118, SEQ ID NO:120, 
SEQ ID NO:122, SEQ ID NO:124, SEQ ID NO:126, SEQ ID NO:128, SEQ ID NO:130, 
SEQ ID NO:132, SEQ ID NO:134, SEQ ID NO:136, SEQ ID NO:138, SEQ ID NO:140, 
SEQ ID NO:142, SEQ ID NO:144, SEQ ID NO:146, SEQ ID NO:148, SEQ ID NO:150, SEQ ID 
NO:152, SEQ ID NO:154, SEQ ID NO:156, SEQ ID NO:158, SEQ ID NO:160, SEQ ID 
NO:162, SEQ ID NO:164, SEQ ID NO:166, SEQ ID NO:168, SEQ ID NO:170, SEQ ID 
NO:172, SEQ ID NO:174, SEQ ID NO:176, SEQ ID NO:178, SEQ ID NO:180, SEQ ID 
NO:182, SEQ ID NO:184, SEQ ID NO:186, SEQ ID NO:188, SEQ ID NO:190, SEQ ID 
NO:192, SEQ ID NO:194, SEQ ID NO:196, SEQ ID NO:198, SEQ ID NO:200, SEQ ID 
NO:202, SEQ ID NO:204, SEQ ID NO:206, SEQ ID NO:208, SEQ ID NO:210, SEQ ID 
NO:212, SEQ ID NO:214, SEQ ID NO:216, SEQ ID NO:218, SEQ ID NO:220, SEQ ID 
NO:222, SEQ ID NO:224, SEQ ID NO:226, SEQ ID NO:228, SEQ ID NO:230, SEQ ID 
NO:232, SEQ ID NO:234, SEQ ID NO:236, SEQ ID NO:238, SEQ ID NO:240, SEQ ID 
NO:242, SEQ ID NO:244, SEQ ID NO:246, SEQ ID NO:248, SEQ ID NO:250, SEQ ID 
NO:252, SEQ ID NO:254, SEQ ID NO:256, SEQ ID NO:258, SEQ ID NO:260, SEQ ID 
NO:262, SEQ ID NO:264, SEQ ID NO:266, SEQ ID NO:268, SEQ ID NO:270, SEQ ID 
NO:272, SEQ ID NO:274, SEQ ID NO:276, SEQ ID NO:278, SEQ ID NO:280, SEQ ID 
NO:282, SEQ ID NO:284, SEQ ID NO:286, SEQ ID NO:288, SEQ ID NO:290, SEQ ID 
NO:292, SEQ ID NO:294, SEQ ID NO:296, SEQ ID NO:298, SEQ ID NO:300, SEQ ID 
NO:302, SEQ ID NO:304, SEQ ID NO:306, SEQ ID NO:308, SEQ ID NO:310, SEQ ID 
NO:312, SEQ ID NO:314, SEQ ID NO:316, SEQ ID NO:318, SEQ ID NO:320, SEQ ID 
NO:322, SEQ ID NO:324, SEQ ID NO:326, SEQ ID NO:328, SEQ ID NO:330, SEQ ID 
NO:332, SEQ ID NO:334, SEQ ID NO:336, SEQ ID NO:338, SEQ ID NO:340, SEQ ID 
NO:342, SEQ ID NO:344, SEQ ID NO:346, SEQ ID NO:348, SEQ ID NO:350, SEQ ID 
NO:352, SEQ ID NO:354, SEQ ID NO:356, SEQ ID NO:358, SEQ ID NO:360, SEQ ID 
NO:362, SEQ ID NO:364, SEQ ID NO:366, SEQ ID NO:368, SEQ ID NO:370, SEQ ID 
NO:372, SEQ ID NO:374, SEQ ID NO:376, SEQ ID NO:378 or SEQ ID NO:380; or, 
consisting of a sequence as set forth in Table 4. 
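
The "residues 1 to N" language of claim 162 can be illustrated with a short Python sketch that enumerates N-terminal prefixes of a polypeptide. This is only an illustration under stated assumptions: the function name `n_terminal_prefixes` is hypothetical, the example string is the first 48 residues of SEQ ID NO:2 from the sequence listing written in one-letter code, and the sketch uses a contiguous range of 14 to 44 residues, whereas the claim recites specific lengths. It does not predict which prefix actually functions as a signal sequence.

```python
# Illustrative sketch only: enumerate candidate N-terminal prefixes
# ("residues 1 to N") of a polypeptide. Names here are hypothetical.

# First 48 residues of SEQ ID NO:2 (from the sequence listing), one-letter code.
SEQ2_N_TERMINUS = "MTDHNASETSLFEQCGYSREAIQARLERNWYEMFEGPDKIYWENDEGL"


def n_terminal_prefixes(protein: str, min_len: int = 14, max_len: int = 44) -> dict:
    """Map each length N to residues 1..N of `protein`.

    Assumption: a contiguous range of lengths is used for simplicity;
    the claim itself lists specific lengths.
    """
    if min_len < 1 or max_len < min_len:
        raise ValueError("expected 1 <= min_len <= max_len")
    upper = min(max_len, len(protein))
    return {n: protein[:n] for n in range(min_len, upper + 1)}


if __name__ == "__main__":
    prefixes = n_terminal_prefixes(SEQ2_N_TERMINUS)
    for n in (14, 20, 44):
        print(f"residues 1 to {n}: {prefixes[n]}")
```

Each returned prefix is simply a slice of the listed sequence; whether a given prefix directs secretion would have to be established experimentally.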

163. A chimeric polypeptide comprising at least a first domain comprising 
signal peptide (SP) having a sequence as set forth in claim 162, and at least a second domain 
comprising a heterologous polypeptide or peptide, wherein the heterologous polypeptide or 
peptide is not naturally associated with the signal peptide (SP). 

164. The chimeric polypeptide of claim 163, wherein the heterologous 
polypeptide or peptide is not a xylanase. 

165. The chimeric polypeptide of claim 163, wherein the heterologous 
polypeptide or peptide is amino terminal to, carboxy terminal to or on both ends of the signal 
peptide (SP) or a xylanase catalytic domain (CD). 

166. An isolated or recombinant nucleic acid encoding a chimeric 
polypeptide, wherein the chimeric polypeptide comprises at least a first domain comprising 
signal peptide (SP) having a sequence as set forth in claim 162 and at least a second domain 
comprising a heterologous polypeptide or peptide, wherein the heterologous polypeptide or 
peptide is not naturally associated with the signal peptide (SP). 

167. A method of increasing thermotolerance or thermostability of a 
xylanase polypeptide, the method comprising glycosylating a xylanase, wherein the 
polypeptide comprises at least thirty contiguous amino acids of a polypeptide as set forth in 
claim 60, or a polypeptide encoded by a nucleic acid as set forth in claim 1 or claim 24, 
thereby increasing the thermotolerance or thermostability of the xylanase. 

168. A method for overexpressing a recombinant xylanase in a cell 
comprising expressing a vector comprising a nucleic acid sequence as set forth in claim 1 or 
claim 24, wherein overexpression is effected by use of a high activity promoter, a dicistronic 
vector or by gene amplification of the vector. 

169. A method of making a transgenic plant comprising the following steps: 

(a) introducing a heterologous nucleic acid sequence into the cell, wherein the 
heterologous nucleic sequence comprises a sequence as set forth in claim 1 or claim 24, 
thereby producing a transformed plant cell; 

(b) producing a transgenic plant from the transformed cell. 

170. The method as set forth in claim 169, wherein the step (a) further 
comprises introducing the heterologous nucleic acid sequence by electroporation or 
microinjection of plant cell protoplasts. 

171. The method as set forth in claim 169, wherein the step (a) comprises 
introducing the heterologous nucleic acid sequence directly to plant tissue by DNA particle 
bombardment or by using an Agrobacterium tumefaciens host. 

172. A method of expressing a heterologous nucleic acid sequence in a 
plant cell comprising the following steps: 

(a) transforming the plant cell with a heterologous nucleic acid sequence 
operably linked to a promoter, wherein the heterologous nucleic sequence comprises a 
sequence as set forth in claim 1 or claim 24; 

(b) growing the plant under conditions wherein the heterologous nucleic acid 
sequence is expressed in the plant cell. 
173. A method for hydrolyzing, breaking up or disrupting a xylan-comprising 
composition comprising the following steps: 

(a) providing a polypeptide having a xylanase activity as set forth in claim 64, 
or a polypeptide encoded by a nucleic acid as set forth in claim 1 or claim 24; 

(b) providing a composition comprising a xylan; and 

(c) contacting the polypeptide of step (a) with the composition of step (b) 
under conditions wherein the xylanase hydrolyzes, breaks up or disrupts the xylan-comprising 
composition. 

174. The method as set forth in claim 173, wherein the composition 
comprises a plant cell, a bacterial cell, a yeast cell, an insect cell, or an animal cell. 

175. A dough or a bread product comprising a polypeptide as set forth in 
claim 64. 

176. A method of dough conditioning comprising contacting a dough or a 
bread product with at least one polypeptide as set forth in claim 64 under conditions 
sufficient for conditioning the dough. 

177. A beverage comprising a polypeptide as set forth in claim 64. 

178. A method of beverage production comprising administration of at least 
one polypeptide as set forth in claim 64 to a beverage or a beverage precursor under 
conditions sufficient for decreasing the viscosity of the beverage. 

179. The method of claim 178, wherein the beverage or beverage precursor 
is a wort or a beer. 

180. A food, a feed or a nutritional supplement comprising a polypeptide as 
set forth in claim 64. 

181. A method for utilizing a xylanase as a nutritional supplement in an 
animal diet, the method comprising: 

preparing a nutritional supplement containing a xylanase enzyme comprising 
at least thirty contiguous amino acids of a polypeptide as set forth in claim 64; and 

administering the nutritional supplement to an animal to increase utilization of 
a xylan contained in a feed or a food ingested by the animal. 

182. The method of claim 181, wherein the animal is a human. 

183. The method of claim 181, wherein the animal is a human. 

184. The method of claim 181, wherein the animal is a ruminant or a 
monogastric animal. 

185. The method of claim 181, wherein the xylanase enzyme is prepared by 
expression of a polynucleotide encoding the xylanase in an organism selected from the group 
consisting of a bacterium, a yeast, a plant, an insect, a fungus and an animal. 

186. The method of claim 185, wherein the organism is selected from the 
group consisting of an S. pombe, S. cerevisiae, Pichia pastoris, Pseudomonas sp., E. coli, 
Streptomyces sp., Bacillus sp. and Lactobacillus sp. 

187. An edible enzyme delivery matrix comprising a thermostable 
recombinant xylanase enzyme. 

188. The edible enzyme delivery matrix of claim 187 comprising a 
polypeptide as set forth in claim 64. 

189. A method for delivering a xylanase supplement to an animal, the 
method comprising: 

preparing an edible enzyme delivery matrix in the form of pellets comprising a 
granulate edible carrier and a thermostable recombinant xylanase enzyme, wherein the pellets 
readily disperse the xylanase enzyme contained therein into aqueous media, and 
administering the edible enzyme delivery matrix to the animal. 

190. The method of claim 189, wherein the recombinant xylanase enzyme 
comprises a polypeptide as set forth in claim 64. 



191. The method of claim 189, wherein the granulate edible carrier 
comprises a carrier selected from the group consisting of a grain germ, a grain germ that is 
spent of oil, a hay, an alfalfa, a timothy, a soy hull, a sunflower seed meal and a wheat midd. 

192. The method of claim 189, wherein the edible carrier comprises grain 
germ that is spent of oil. 

193. The method of claim 189, wherein the xylanase enzyme is 
glycosylated to provide thermostability at pelletizing conditions. 

194. The method of claim 189, wherein the delivery matrix is formed by 
pelletizing a mixture comprising a grain germ and a xylanase. 

195. The method of claim 189, wherein the pelletizing conditions include 
application of steam. 

196. The method of claim 189, wherein the pelletizing conditions comprise 
application of a temperature in excess of about 80°C for about 5 minutes and the enzyme 
retains a specific activity of at least 350 to about 900 units per milligram of enzyme. 

197. An isolated or recombinant nucleic acid comprising a sequence 
encoding a polypeptide having a xylanase activity and a signal sequence, wherein the nucleic 
acid comprises a sequence as set forth in claim 1. 

198. The isolated or recombinant nucleic acid of claim 197, wherein the 
signal sequence is derived from another xylanase or a non-xylanase enzyme. 

199. An isolated or recombinant nucleic acid comprising a sequence 
encoding a polypeptide having a xylanase activity, wherein the sequence does not contain a 
signal sequence and the nucleic acid comprises a sequence as set forth in claim 1. 
200. An isolated or recombinant nucleic acid comprising a sequence as set 
forth in SEQ ID NO:189, wherein SEQ ID NO:189 contains one or more of the following 
mutations: the nucleotides at positions 22 to 24 are TTC, the nucleotides at positions 31 to 33 
are CAC, the nucleotides at positions 34 to 36 are TTG, the nucleotides at positions 49 to 51 
are ATA, the nucleotides at positions 31 to 33 are CAT, the nucleotides at positions 67 to 69 
are ACG, the nucleotides at positions 178 to 180 are CAC, the nucleotides at positions 190 to 
192 are TGT, the nucleotides at positions 190 to 192 are GTA, the nucleotides at positions 
190 to 192 are GTT, the nucleotides at positions 193 to 195 are GTG, the nucleotides at 
positions 202 to 204 are GCT, the nucleotides at positions 235 to 237 are CCA, or the 
nucleotides at positions 235 to 237 are CCC. 

201. A method for making a nucleic acid comprising a sequence as set forth 
in claim 200, wherein the mutations in SEQ ID NO:189 are obtained by gene site saturated 
mutagenesis (GSSM™). 

202. An isolated or recombinant polypeptide comprising an amino acid 
sequence comprising SEQ ID NO:190, wherein SEQ ID NO:190 contains one or more of 
the following mutations: the aspartic acid at amino acid position 8 is phenylalanine, the 
glutamine at amino acid position 11 is histidine, the asparagine at amino acid position 12 is 
leucine, the glycine at amino acid position 17 is isoleucine, the threonine at amino acid 
position 23 is threonine encoded by a codon other than the wild type codon, the glycine at 
amino acid position 60 is histidine, the proline at amino acid position 64 is cysteine, the 
proline at amino acid position 64 is valine, the serine at amino acid position 65 is valine, the 
glycine at amino acid position 68 is isoleucine, the glycine at amino acid position 68 is 
alanine, or the valine at amino acid position 79 is proline. 
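
The nucleotide positions recited in claim 200 and the amino acid positions recited in claim 202 are related by simple codon arithmetic, which the following Python sketch works through. It is a minimal illustration under stated assumptions: the reading frame is assumed to start at nucleotide 1 of SEQ ID NO:189, the codon table is the relevant subset of the standard genetic code, and the names `CLAIM_200_CODONS`, `residue_position` and `encoded_changes` are hypothetical helpers introduced here.

```python
# Illustrative sketch only. Assumes the coding frame of SEQ ID NO:189
# starts at nucleotide 1 and that the standard genetic code applies.

# Subset of the standard genetic code covering the codons named in claim 200.
CODON_TO_AA = {
    "TTC": "Phe", "CAC": "His", "CAT": "His", "TTG": "Leu", "ATA": "Ile",
    "ACG": "Thr", "TGT": "Cys", "GTA": "Val", "GTT": "Val", "GTG": "Val",
    "GCT": "Ala", "CCA": "Pro", "CCC": "Pro",
}

# (first nucleotide position of the codon, codon) as recited in claim 200.
CLAIM_200_CODONS = [
    (22, "TTC"), (31, "CAC"), (34, "TTG"), (49, "ATA"), (31, "CAT"),
    (67, "ACG"), (178, "CAC"), (190, "TGT"), (190, "GTA"), (190, "GTT"),
    (193, "GTG"), (202, "GCT"), (235, "CCA"), (235, "CCC"),
]


def residue_position(first_nt: int) -> int:
    """Convert the 1-based first nucleotide of a codon to its 1-based residue number."""
    if first_nt < 1 or (first_nt - 1) % 3 != 0:
        raise ValueError("position is not the first base of an in-frame codon")
    return (first_nt - 1) // 3 + 1


def encoded_changes(codons=CLAIM_200_CODONS):
    """Return (residue position, encoded amino acid) for each recited codon."""
    return [(residue_position(nt), CODON_TO_AA[codon]) for nt, codon in codons]


if __name__ == "__main__":
    for (nt, codon), (pos, aa) in zip(CLAIM_200_CODONS, encoded_changes()):
        print(f"nt {nt}-{nt + 2} = {codon} -> residue {pos} is {aa}")
```

Under these assumptions, positions 22 to 24 map to residue 8, positions 67 to 69 to residue 23 (ACG still encodes threonine, a silent change), and positions 235 to 237 to residue 79, in agreement with the residue numbers recited in claim 202.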

203. A method for reducing lignin in a wood or wood product comprising 
contacting the wood or wood product with a polypeptide as set forth in claim 64. 

204. A detergent composition comprising a polypeptide as set forth in claim 
64. 

205. A pharmaceutical composition comprising a polypeptide as set forth in 
claim 64. 

206. A method for eliminating or protecting animals from a microorganism 
comprising a xylan comprising administering a polypeptide as set forth in claim 64. 

207. The method of claim 206, wherein the microorganism is a bacterium. 

208. The method of claim 205, wherein the bacterium is a salmonellae. 

209. An isolated or recombinant nucleic acid comprising SEQ ID NO:189, 
wherein SEQ ID NO:189 comprises one or more or all of the following sequence variations: 
the nucleotides at positions 22 to 24 are TTC, the nucleotides at positions 22 to 24 are TTT, 
the nucleotides at positions 31 to 33 are CAC, the nucleotides at positions 31 to 33 are CAT, 
the nucleotides at positions 34 to 36 are TTG, the nucleotides at positions 34 to 36 are TTA, 
the nucleotides at positions 34 to 36 are CTC, the nucleotides at positions 34 to 36 are CTT, 
the nucleotides at positions 34 to 36 are CTA, the nucleotides at positions 34 to 36 are CTG, 
the nucleotides at positions 49 to 51 are ATA, the nucleotides at positions 49 to 51 are ATT, 
the nucleotides at positions 49 to 51 are ATC, the nucleotides at positions 178 to 180 are 
CAC, the nucleotides at positions 178 to 180 are CAT, the nucleotides at positions 190 to 192 
are TGT, the nucleotides at positions 190 to 192 are TGC, the nucleotides at positions 190 to 
192 are GTA, the nucleotides at positions 190 to 192 are GTT, the nucleotides at positions 
190 to 192 are GTC, the nucleotides at positions 190 to 192 are GTG, the nucleotides at 
positions 193 to 195 are GTG, the nucleotides at positions 193 to 195 are GTC, the 
nucleotides at positions 193 to 195 are GTA, the nucleotides at positions 193 to 195 are GTT, 
the nucleotides at positions 202 to 204 are ATA, the nucleotides at positions 202 to 204 are 
ATT, the nucleotides at positions 202 to 204 are ATC, the nucleotides at positions 202 to 204 
are GCT, the nucleotides at positions 202 to 204 are GCG, the nucleotides at positions 202 to 
204 are GCC, the nucleotides at positions 202 to 204 are GCA, the nucleotides at positions 
235 to 237 are CCA, the nucleotides at positions 235 to 237 are CCC, or the nucleotides at 
positions 235 to 237 are CCG. 

210. An isolated or recombinant polypeptide comprising an amino acid 
sequence comprising SEQ ID NO:190, wherein SEQ ID NO:190 comprises one or more or 
all of the following sequence variations: the aspartic acid at amino acid position 8 is 
phenylalanine, the glutamine at amino acid position 11 is histidine, the asparagine at amino 
acid position 12 is leucine, the glycine at amino acid position 17 is isoleucine, the threonine 
at amino acid position 23 is threonine encoded by a codon other than the wild type codon, the 
glycine at amino acid position 60 is histidine, the proline at amino acid position 64 is 
cysteine, the proline at amino acid position 64 is valine, the serine at amino acid position 65 
is valine, the glycine at amino acid position 68 is isoleucine, the glycine at amino acid 
position 68 is alanine, or the serine at amino acid position 79 is proline. 

211. An isolated or recombinant nucleic acid comprising SEQ ID NO:189, 
wherein SEQ ID NO:189 comprises one or more or all sequence variations set forth in Table 
1 or Table 2. 

212. An isolated or recombinant polypeptide encoded by the nucleic acid of 
claim 211. 

213. An isolated or recombinant nucleic acid comprising SEQ ID NO:379, 
wherein SEQ ID NO:379 comprises one or more or all of the following sequence variations: 
the nucleotides at positions 22 to 24 are TTC, the nucleotides at positions 31 to 33 are CAC, 
the nucleotides at positions 49 to 51 are ATA, the nucleotides at positions 178 to 180 are 
CAC, the nucleotides at positions 193 to 195 are GTG, the nucleotides at positions 202 to 
204 are GCT. 

214. An isolated or recombinant polypeptide comprising SEQ ID NO:380, 
wherein SEQ ID NO:380 comprises one or more or all of the following sequence variations: 
D8F, Q11H, G17I, G60H, S65V and/or G68A. 

215. The isolated or recombinant polypeptide of claim 210 or claim 214, 
wherein the polypeptide has a thermostable xylanase activity. 

SEQUENCE LISTING 

<110> Steer, Brian 
Callen, Walter 
Healey, Shaun 
Hazlewood, Geoff 
Wu, Di 
Blum, David 
Esteghlalian, Alireza 

<120> XYLANASES, NUCLEIC ACIDS ENCODING THEM AND METHODS FOR MAKING AND USING 
THEM 

<130> 09010-290001 and 09010-290WO1 

<140> not assigned 
<141> 2003-06-16 

<150> US 60/389,299 
<151> 2002-06-14 

<160> 380 

<170> FastSEQ for Windows version 4.0 

<210> 1 

<211> 1128 

<212> DNA 

<213> Bacteria 



<400> 1 

atgaccgacc acaacgcttc cgaaaccagc ctgttcgaac agtgcggcta cagccgcgag 60 
gccatccagg cccgcctgga gcgcaactgg tatgagatgt tcgaaggccc ggacaagatt 120 
tactgggaga acgacgaagg cctggggtac gtgatggaca ccggcaacca cgacgtgcgc 180 
accgagggca tgagctacgc gatgatgatc gccgtgcagt acggccgcaa ggacgtgttc 240 
gacaagctgt ggggttgggt catgaaatac atgttcatga ccgagggcct gcaccagggc 300 
tacttcgcct ggtctgtgga ccccagcggc gtaccgaacg ccgacggtcc ggccccggac 360 
ggcgaggaat acttcgcgat ggacctgttc ctggcctccg cgcgatgggg cgacggcgaa 420 
ggcgtgtacg agtactcccg ccacgcccgc tcgatcctcc acacctgcgt gcaccagggc 480 
gaggacggtg aaggctatcc gatgtggaac ccggagaacc atctgatcaa gttcatcccg 540 
gaaaccgaat ggaccgaccc gtcctaccat ctgccgcact tctacgaggt gttcgccgag 600 
cgcgccgacg aggccgaccg tccgttctgg gcgcaggccg ccaaggcgag ccgcgagtac 660 
ctggtcaccg cctgccaccc gcagaccggc atgaaccccg aatactcaaa ctatgatggc 720 
acgccgcacg tcgacgagcg cgaccactgg catttctact ccgacgccta ccgcaccgcc 780 
ggcaacatcg ggctggactg cctgtggaac ggcgtcgtgc cggaactgtg cgatgcgaat 840 
gcgcgtctgc agcgtttctt cctcgaacac gaccgcacct gcgtgtatgc gatcgacggc 900 
acgccggtgg acgagaccgt gctgcacccg gtcggcttca tcgccgccac cgccgaaggc 960 
tcgctcgccg cgatgcactc gcaggagccg gacgcgctcg acaacgcgat ccgctgggtg 1020 
cgcctgctgt gggacacccc gatccgcacc ggcacgcgcc gctactacga caacttcctc 1080 
tacgccttcg cgttcctggc gctggcgggg gagtaccgca cctggtga 1128 



<210> 2 
<211> 375 
<212> PRT 
<213> Bacteria 

Met Thr Asp His Asn Ala Ser Glu Thr Ser Leu Phe Glu Gln Cys Gly 
1 5 10 15 

Tyr Ser Arg Glu Ala Ile Gln Ala Arg Leu Glu Arg Asn Trp Tyr Glu 
20 25 30 

Met Phe Glu Gly Pro Asp Lys Ile Tyr Trp Glu Asn Asp Glu Gly Leu 
35 40 45 

Gly Tyr Val Met Asp Thr Gly Asn His Asp Val Arg Thr Glu Gly Met 
50 55 60 

Ser Tyr Ala Met Met Ile Ala Val Gln Tyr Gly Arg Lys Asp Val Phe 
65 70 75 80 

Asp Lys Leu Trp Gly Trp Val Met Lys Tyr Met Phe Met Thr Glu Gly 
150 155 160 

Glu Asp Gly Glu Gly Tyr Pro Met Trp Asn Pro Glu Asn His Leu Ile 
165 170 175 

Lys Phe Ile Pro Glu Thr Glu Trp Thr Asp Pro Ser Tyr His Leu Pro 
180 185 190 

His Phe Tyr Glu Val Phe Ala Glu Arg Ala Asp Glu Ala Asp Arg Pro 
195 200 205 

Phe Trp Ala Gln Ala Ala Lys Ala Ser Arg Glu Tyr Leu Val Thr Ala 
210 215 220 

Cys His Pro Gln Thr Gly Met Asn Pro Glu Tyr Ser Asn Tyr Asp Gly 
225 230 235 240 

Thr Pro His Val Asp Glu Arg Asp His Trp His Phe Tyr Ser Asp Ala 
245 250 255 

Tyr Arg Thr Ala Gly Asn Ile Gly Leu Asp Cys Leu Trp Asn Gly Val 
260 265 270 

Val Pro Glu Leu Cys Asp Ala Asn Ala Arg Leu Gln Arg Phe Phe Leu 
275 280 285 

Glu His Asp Arg Thr Cys Val Tyr Ala Ile Asp Gly Thr Pro Val Asp 
290 295 300 

Glu Thr Val Leu His Pro Val Gly Phe Ile Ala Ala Thr Ala Glu Gly 
305 310 315 320 

Ser Leu Ala Ala Met His Ser Gln Glu Pro Asp Ala Leu Asp Asn Ala 
325 330 335 

Ile Arg Trp Val Arg Leu Leu Trp Asp Thr Pro Ile Arg Thr Gly Thr 
340 345 350 

Arg Arg Tyr Tyr Asp Asn Phe Leu Tyr Ala Phe Ala Phe Leu Ala Leu 
355 360 365 

Ala Gly Glu Tyr Arg Thr Trp 
370 375 

<210> 3 
<211> 2196 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 3 

atgttcaaag aatggggaaa gaccgactcg gaaattacca caaaagtaaa caccgcgtgg 60 

aacaaactgt ttgttaacgg ggtggaatcc ggagataatg ccgaaagaat ctatgtagag 120 

actggaagcg acatggcgta catccatacc tttgacagca acgacgtgcg ctccgaagga 180 

atgtcctacg gcatgatgat gtgcgtacag atgaacgatc agacaagatt taacaaactc 240 

tggaaatggg caagaaccta tatgtacaat gaaacagacg ccggcagtaa ttccaggggc 300 

tatttctcat ggcagtgcag tacaagcggc tcaaaaatgg ataagggccc cgctcctgac 360 

ggcgaggaat actttattac ggcgctgttg ttcgcgcacg cccgctgggg gagcgcgtcc 420 

ggtactacaa acataaacaa ttacgcgcag caagcaaggc agattatcta tgacttaacg 480 

cgccgcaaac cggggaacgg agatccttac ggcgagcctt caatgtttaa tgtagacaac 540 

tatatggtta gattcgccac acttggaaat tccgccacct ttacagaccc ctcataccat 600 

ttaccggcat tctatgatgt ttgggcgctg gaattacagg cggactatga taatagtaaa 660 

ctctacggta tctgggctga taaggctgac ttgaaaaaag acattgatta ctttaaacaa 720 

gcggcgacca caagccgttc attctttgca aaaacgacaa acggtacaac cggacttgga 780 

ccggattatg ccggctttga cggaacgcct aaaaatgaag gggatcacaa gtatttcgag 840 

tatgacgcgt ggcgtatcgc gatgaacata ggtatggact acgcgtggtt cgcgaaagat 900 

tcctggcaga agacatttgc cgacagaatt caggcgttct ttgtcagcaa gggagtcact 960 

tcttacggaa accgctggac attggacggg actcaaaggg gagcggatca ctcgccgggt 1020 

cttgtcggct gtaacgcggt cgcctctctc gcggcgacaa acgcgaacgc gtggaaattt 1080 

atcgaagact tctggaacat cagcatgacg aaaggcaaat accgttacta tgacggatgt 1140 

ctgtatatga tgagcatgct gcacttaagc ggcaacttta aggcgtatct ttctacaaat 1200 

accacgcccg ccaacagttc cagcattacc ccgacaaccg cgtctttcga caagaagaca 1260 

agcgcacaag ccgacattgc cgtaacagtg acgcttaacg ggaatacatt ctcaagtatc 1320 
acaaacaacg gtacagccct tacaagcggc acagactact cagtgagtgg aacaaagtat 1380 

acgataaaga aagaatacct tgcaaaacag cctgtaggaa caacgaagct cgcattcaac 1440 

ttcagtgccg gaggaactcc ggaacttaca gttactataa cggacacggg cagctccagc 1500 

atcagcccga caaccgcgac attcgacaaa aagaccggag cgcaagccga catcgccgta 1560 

accatgacgc ttaatgggaa tactttgtcg aacatcaaaa acggttctgc acaacttaca 1620 

agcggaactg actactcaac gagcggcagt acggtaacga ttaaaaaaga atacctggca 1680 

aagcaggcta acggcacagt aacgcttacc ttcacattca gcgcaggcgc ggcccaaact 1740 

attgacatca cggtaaaaga tacaaccggc ggagcggcgg gaataaaata caacttcgca 1800 

actgacaacc tgcccaacgg gtacccgaag tacagttcaa gtgatatatc cgcgacaata 1860 

accggaggag ctttggtaat aaccaaaacc ggaaataatt cgtccccgaa gattacattg 1920 

ccctttagtg taacaggtaa cctttccggt tatacaggca taaagataaa tgtaaaggga 1980 

gtatccggag atttcactta taaagtattg aatgccgcaa taggttctac aaatctcggc 2040 

agcgtaaata acgccccaat accaaacggc tcatttggag acgtaacaat accaataacc 2100 

ggcggtacaa acaccggaga tttagatata tcgttctggc tcaataacac aaatgcttac 2160 

gttattgaga ttaagagcat agagctggta aaatga 2196 

<210> 4 
<211> 711 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 4 

Met Phe Lys Glu Trp Gly Lys Thr Asp Ser Glu Ile Thr Thr Lys Val 
1 5 10 15 

Asn Thr Ala Trp Asn Lys Leu Phe Val Asn Gly Val Glu Ser Gly Asp 
20 25 30 

Asn Ala Glu Arg Ile Tyr Val Glu Thr Gly Ser Asp Met Ala Tyr Ile 
35 40 45 

His Thr Phe Asp Ser Asn Asp Val Arg Ser Glu Gly Met Ser Tyr Gly 
50 55 60 

Met Met Met Cys Val Gln Met Asn Asp Gln Thr Arg Phe Asn Lys Leu 
65 70 75 80 

Trp Lys Trp Ala Arg Thr Tyr Met Tyr Asn Glu Thr Asp Ala Gly Ser 
85 90 95 

Asn Ser Arg Gly Tyr Phe Ser Trp Gln Cys Ser Thr Ser Gly Ser Lys 
100 105 110 

Met Asp Lys Gly Pro Ala Pro Asp Gly Glu Glu Tyr Phe Ile Thr Ala 
115 120 125 

Leu Leu Phe Ala His Ala Arg Trp Gly Ser Ala Ser Gly Thr Thr Asn 
130 135 140 

Ile Asn Asn Tyr Ala Gln Gln Ala Arg Gln Ile Ile Tyr Asp Leu Thr 
145 150 155 160 

Arg Arg Lys Pro Gly Asn Gly Asp Pro Tyr Gly Glu Pro Ser Met Phe 
165 170 175 

Asn Val Asp Asn Tyr Met Val Arg Phe Ala Thr Leu Gly Asn Ser Ala 
180 185 190 

Thr Phe Thr Asp Pro Ser Tyr His Leu Pro Ala Phe Tyr Asp Val Trp 
195 200 205 

Ala Leu Glu Leu Gln Ala Asp Tyr Asp Asn Ser Lys Leu Tyr Gly Ile 
210 215 220 

Trp Ala Asp Lys Ala Asp Leu Lys Lys Asp Ile Asp Tyr Phe Lys Gln 
225 230 235 240 

Ala Ala Thr Thr Ser Arg Ser Phe Phe Ala Lys Thr Thr Asn Gly Thr 
245 250 255 

Thr Gly Leu Gly Pro Asp Tyr Ala Gly Phe Asp Gly Thr Pro Lys Asn 
260 265 270 

Glu Gly Asp His Lys Tyr Phe Glu Tyr Asp Ala Trp Arg Ile Ala Met 
275 280 285 

Asn Ile Gly Met Asp Tyr Ala Trp Phe Ala Lys Asp Ser Trp Gln Lys 
290 295 300 

Thr Phe Ala Asp Arg Ile Gln Ala Phe Phe Val Ser Lys Gly Val Thr 
305 310 315 320 

Ser Tyr Gly Asn Arg Trp Thr Leu Asp Gly Thr Gln Arg Gly Ala Asp 
325 330 335 

His Ser Pro Gly Leu Val Gly Cys Asn Ala Val Ala Ser Leu Ala Ala 
340 345 350 

Thr Asn Ala Asn Ala Trp Lys Phe Ile Glu Asp Phe Trp Asn Ile Ser 
355 360 365 

Met Thr Lys Gly Lys Tyr Arg Tyr Tyr Asp Gly Cys Leu Tyr Met Met 
370 375 380 

Ser Met Leu His Leu Ser Gly Asn Phe Lys Ala Tyr Leu Ser Thr Asn 
385 390 395 400 

Thr Thr Pro Ala Asn Ser Ser Ser Ile Thr Pro Thr Thr Ala Ser Phe 
405 410 415 

Asp Lys Lys Thr Ser Ala Gln Ala Asp Ile Ala Val Thr Val Thr Leu 
420 425 430 

Asn Gly Asn Thr Phe Ser Ser Ile Thr Asn Asn Gly Thr Ala Leu Thr 
435 440 445 

Ser Gly Thr Asp Tyr Ser Val Ser Gly Thr Lys Tyr Thr Ile Lys Lys 
450 455 460 

Glu Tyr Leu Ala Lys Gln Pro Val Gly Thr Thr Lys Leu Ala Phe Asn 
465 470 475 480 

Phe Ser Ala Gly Gly Thr Pro Glu Leu Thr Val Thr Ile Thr Asp Thr 
485 490 495 

Gly Ser Ser Ser Ile Ser Pro Thr Thr Ala Thr Phe Asp Lys Lys Thr 
500 505 510 

Gly Ala Gln Ala Asp Ile Ala Val Thr Met Thr Leu Asn Gly Asn Thr 
515 520 525 

Leu Ser Asn Ile Lys Asn Gly Ser Ala Gln Leu Thr Ser Gly Thr Asp 
530 535 540 

Tyr Ser Thr Ser Gly Ser Thr Val Thr Ile Lys Lys Glu Tyr Leu Ala 
545 550 555 560 

Lys Gln Ala Asn Gly Thr Val Thr Leu Thr Phe Thr Phe Ser Ala Gly 
565 570 575 

Ala Ala Gln Thr Ile Asp Ile Thr Val Lys Asp Thr Thr Gly Gly Ala 
580 585 590 

Ala Gly Ile Lys Tyr Asn Phe Ala Thr Asp Asn Leu Pro Asn Gly Tyr 
595 600 605 

Pro Lys Tyr Ser Ser Ser Asp Ile Ser Ala Thr Ile Thr Gly Gly Ala 
610 615 620 

Leu Val Ile Thr Lys Thr Gly Asn Asn Ser Ser Pro Lys Ile Thr Leu 
625 630 635 640 

Pro Phe Ser Val Thr Gly Asn Leu Ser Gly Tyr Thr Gly Ile Lys Ile 
645 650 655 

Asn Val Lys Gly Val Ser Gly Asp Phe Thr Tyr Lys Val Leu Asn Ala 
660 665 670 

Ala Ile Gly Ser Thr Asn Leu Gly Ser Val Asn Asn Ala Pro Ile Pro 
675 680 685 

Asn Gly Ser Phe Gly Asp Val Thr Ile Pro Ile Thr Gly Gly Thr Asn 
690 695 700 

Thr Gly Asp Leu Asp Ile Ser 
705 710 

<210> 5 
<211> 2106 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample 
<400> 5 

atgcaaaacc tatttaagcg tgtgtttttc catcttctct tgcttgcctt gctggcaggc 60 

tgtgctggcc cttctcccgt aacaccggag ccgaccgaaa tgccgaccca ggtccctaca 120 

ccaacgccta gtcttggcgc ctacgagagc ggcgagtatc gcaacctgtt cgccgaggcg 180 

cttggcaaat cggatgccga aattcaggcc aaaatcgatg ccgctttcca acaacttttc 240 

tacggcgacg atgtttctga gcgcgtctat tacccggttg gcagcgacat gggctatatg 300 

ctcgacaccg gcaacgacga tgtgcgctcc gagggcatgt cctacggcat gatgattgcc 360 

gtccagatga acaagaagga agaattcgac cgcatctgga agtggaccaa aacctacatg 420 

taccagaccg aaggtggtta caaaggttat tttgcctggc acgctaaaac ggacggcacc 480 

caactggccg ccaacccggc ctctgacggt gaagtctggt ttgtgatggc gctcttcttt 540 

gccgatgcgc gttggggcag cggcgaagga atttataact accgcgccca agcccaggaa 600 

attctcgatg tggccttgaa cgccaaagaa ttgggcggca acctggcgac caacctgttc 660 

gacccggaga ccaaacaggt cgtttttgtg ccgcagttgg gcaataactc gaaatttacc 720 

gacgcttcgt accacatgcc ccatttctac gagttgtggg cgcgttgggc cgataaaaat 780 
aacgactttt gggccgaagc cgctaccgtt agccgcgagt tcctgcctac tgccgttcac 840

cccgaaaccg gcctggcccc taactattcc tacttcgatg gccgccctta caatgacgag 900

tatcacggcc agttccgcta cgacgctttc cgcgtgggcg cgaacatcgg catggattat 960

gtctggtttc acccctctga atggtatcgg gaacaagcca accgccaatt atctttcttc 1020

gcatcccagg gcatcgatga ttatgttgcc gaatattccc tggatggaaa accgctggcc 1080

gggcatcgcg ctacggggtt gattgccacc aatgctgtcc tggcctacgc cgcagacccc 1140

gaaattggtc aacccttcgt ccaggccctg tgggatgcag agcctccgac tggcaggtat 1200

cgctactatg acggcctgct ctacatgatg ggcctgctgc aagccagcgg caacttccgt 1260

atttacgagc cgggtattac gcctcgcgct gagttgccgc ccccgccgcc tcgcgccatc 1320

gagggccgct tcgcgcccat taccgggcgg gccttgcttc tgattggccc gaatgcggat 1380

ggcgtcaacg cttacttcga caaactggtg acagcgccgg gcggcgtgaa tgtcgaacta 1440

tcgctcaaat cgcctgattt ggaagcgctc gacgccctgg cgaggaaata tcccaacagc 1500

acgctttcgg tcgggttgtc gctggatggc ccggtaacag aggcggatgc gcgggtggga 1560

gaattgctcg acgcgttggc tgtttatccg cgcccggtct tcctgcgcat cgggccggaa 1620

tttgatttgg cggcgagcgg ccaggggccg gaggaatatg tcgcggcctg gaaaacgctc 1680

cataacgaga ttcaggcgcg gggtagttcg aatatcgccc tggtgtggca tagcgccgca 1740

gcctgcgagt cgccctttgg cggtcatccg ctcgaagcgt ggtatcccgg tgatgagttt 1800

gtggattggg tggccgtttc gcgcactgcg cagtctgccg attgcgaggg gcagtccgtt 1860

gaggccgtct tgcagtttgc gcgtgagcga tacaaaccgg ttgtgttggt tgcatcgcca 1920

gcagaggaca tcttcgagtt cgtttacgcc aacaacgacg tgattcgcgc cctgctgtat 1980

ctgaacaccg agccgggcct gttcgacacc cccgaatttt tgagcggctg gaaggccgaa 2040

atcggtcagc agttctggct gcgcggcggc ccggcgcttt tttcgacact cggattggat 2100

gagtaa 2106



<210> 6 

<211> 701 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample

<221> SIGNAL 
<222> (1)...(47)

<400> 6 

Met Gin Asn Leu Phe 
1 5 
Leu Leu Ala Gly cys 
20 

Glu Met Pro Thr Gin 
35 

Glu Ser Gly Glu Tyr 
50 

Asp Ala Glu lie Gin 
65 

Tyr Gly Asp Asp Val 
85 

Met Gly Tyr Met Leu 
100 

Met ser Tyr Gly Met 
115 

Phe Asp Arg lie Trp 
130 

Gly Gly Tyr Lys Gly 
145 

Gin Leu Ala Ala Asn 
165 

Ala Leu Phe Phe Ala 
180 

Asn Tyr Arg Ala Gin 
195 

Lys Glu Leu Gly Gly 
210 

Lys Gin val val Phe 
225 

Asp Ala Ser Tyr His 
245 

Ala Asp Lys Asn Asn 



an environmental sample 



Lys Arg 

Ala Gly 

val Pro 

Arg Asn 
55 

Ala Lys 
70 

Ser Glu 

Asp Thr 

Met lie 

Lys Trp 
135 
Tyr Phe 
150 

pro Ala 
Asp Ala 
Ala Gin 



Val Phe Phe His Leu Leu 
10 

Pro Val Thr Pro 



Asn Leu 
215 
val Pro 
230 

Met Pro 



Pro Ser 

25 
Thr Pro 
40 

Leu Phe 

lie Asp 

Arg Val 

Gly Asn 
105 
Ala Val 
120 

Thr Lys 

Ala Trp 

Ser Asp 

Arg Trp 
185 
Glu He 
200 

Ala Thr 



Gin Leu 
His Phe 
Asp Phe Trp Ala 



Thr Pro Ser Leu 
45 

Ala Glu Ala Leu 
60 

Ala Ala Phe Gin 
75 

Tyr Tyr Pro Val 
90 

Asp Asp val Arg 

Gin Met Asn Lys 
125 

Thr Tyr Met Tyr 
140 

His Ala Lys Thr 
155 

Gly Glu val Trp 
170 

Gly Ser Gly Glu 

Leu Asp Val Ala 
205 

Asn Leu Phe Asp 
220 

Gly Asn Asn ser 
235 

Tyr Glu Leu Trp 
250 

Glu Ala Ala Thr 
Leu Leu Ala
15 

Glu Pro Thr 
30 

Gly Ala Tyr 

Gly Lys Ser 

Gin Leu Phe 
80 

Gly ser Asp 
95 

ser Glu Gly 
110 

Lys Glu Glu 

Gin Thr Glu 

Asp Gly Thr 
160 

Phe val Met 

175 
Gly lie Tyr 
190 

Leu Asn Ala 

Pro Glu Thr 

Lys Phe Thr 
240 

Ala Arg Trp 

255 
Val ser Arg 



260 265
Pro Thr Ala Val His Pro Glu Thr Gly Leu
280 285
Phe Asp Gly Arg Pro Tyr Asn Asp Glu Tyr

295 300
Asp Ala Phe Arg Val Gly Ala Asn Ile Gly

310 315
His Pro Ser Glu Trp Tyr Arg Glu Gln Ala

325 330
Phe Ala Ser Gln Gly Ile Asp Asp Tyr Val
340 345
Gly Lys Pro Leu Ala Gly His Arg Ala Thr
360 365
Ala Val Leu Ala Tyr Ala Ala Asp Pro Glu

375 380
Gln Ala Leu Trp Asp Ala Glu Pro Pro Thr

390 395
Asp Gly Leu Leu Tyr Met Met Gly Leu Leu

405 410
Arg Ile Tyr Glu Pro Gly Ile Thr Pro Arg
420 425
Pro Pro Arg Ala Ile Glu Gly Arg Phe Ala
440 445
Leu Leu Leu Ile Gly Pro Asn Ala Asp Gly

455 460
Lys Leu Val Thr Ala Pro Gly Gly Val Asn

470 475
Ser Pro Asp Leu Glu Ala Leu Asp Ala Leu

485 490
Ser Thr Leu Ser Val Gly Leu Ser Leu Asp
500 505
Asp Ala Arg Val Gly Glu Leu Leu Asp Ala
520 525
Pro Val Phe Leu Arg Ile Gly Pro Glu Phe

535 540
Gln Gly Pro Glu Glu Tyr Val Ala Ala Trp

550 555
Ile Gln Ala Arg Gly Ser Ser Asn Ile Ala

565 570
Ala Ala Cys Glu Ser Pro Phe Gly Gly His
580 585
Pro Gly Asp Glu Phe Val Asp Trp Val Ala
600 605
Ser Ala Asp Cys Glu Gly Gln Ser Val Glu

615 620
Arg Glu Arg Tyr Lys Pro Val Val Leu Val

630 635
Ile Phe Glu Phe Val Tyr Ala Asn Asn Asp

645 650
Tyr Leu Asn Thr Glu Pro Gly Leu Phe Asp
660 665
Gly Trp Lys Ala Glu Ile Gly Gln Gln Phe
680 685
Ala Leu Phe Ser Thr Leu Gly Leu Asp Glu
695 700

<210> 7
<211> 1539
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 7 

atggcacgtt taatcaccta ttgcttgatc ggcgtcttac tcgtgatgcc agtccttgcc 60 
gcttgcagca cagcacctac gccaacgctg atgagccagc caacttccac gccgcaaccg 120 
gccctgcaac cgacgccacc accgacgagc gtcccccggt cgatcggggc gtttgagtcc 180 
ggtcagtatc gtaatctctt cacggaatta ctgggcaaga gcgaggccga gattcagcag 240 




Glu 


Phe 


Leu 






275 


Tyr 


Ser 


Tyr 




290 


Phe 


Arg 


Tyr 


305 




val 


Trp 


Phe 


Leu 


Ser 


Phe 


Ser 


Leu 


ASD 








Ala 


Thr 


Asn 




370 




Pro 


Phe 


val 


385 






Arg 


Tyr 


Tyr 


Gly 


Asn 


Phe 


Pro 


Pro 


Pro 






435 


Gly 


Arg 


Ala 


450 




Tyr 


Phe 


ASP 


465 




Ser 


Leu 


Lys 


Tyr 


Pro 


Asn 


Thr 


Glu 


Ala 






515 


Tyr 


Pro 


Arg 


530 




Ala 


Ser 


Gly 


545 




His 


Asn 


Glu 


His 


ser 


Ala 


Ala 


Tro 


Tvr 




595 


Thr 


Ala 


Gin 




610 




Gin 


Phe 


Ala 


625 






Ala 


Glu 


Asp 


Ala 


Leu 


Leu 


Phe 


Leu 


Ser 






675 


Gly 


Gly 


Pro 


690 




<210> 7 





270 






Ala 


Pro 


Asn 


H1 S 


g iy 


Gin 


Met 


Asp 


Tyr 






320 


Asn 


Arg 


Gin 




335 




Ala 


Glu 


Tyr 


350 




Gly 


Leu 


lie 


lie 


Gly 


Gin 


Gly 


Arg 


Tyr 






400 


Gin 


Ala 


Ser 




415 




Ala 


Glu 


Leu 


430 






Pro 


lie 


Thr 


Val 


Asn 


Ala 


va I 


G 1 U 


Leu 






480 


Ala 


Arg 


Lys 




495 


Gly 


Pro 


val 


510 






Leu 


Ala 


Val 


Asp 


Leu 


Ala 


Lys 


Thr 


Leu 




560 


Leu 


val 


Trp 




575 


pro 


Leu 


Glu 


590 






Val 


ser 


Arg 


Ala 


val 


Leu 


Ala 


ser 


Pro 






640 


val 


He 


Arg 




655 


Thr 


Pro 


Glu 


670 






Trp 


Leu 


Arg 
aagatcgatc aggcgtgggc gcagttgttc tacggcgaca acgacacgca gcgcgtttac 300

tatcccgtgg gtcgcgacag ggcctacatc aaagacatcg gcaacaatga tgtgcgcagt 360 

gagggtatgt cgtacggtat gatgctggcg gtgcagctgg acaagcagga agagttcaac 420 

aaattgtgga agtgggcgca cacctatatg ctgcaaaagg atggcccgta caaaggctat 480 

tttgcgtggc atgccaatga gaacggtgaa cagctggatg cgggtcccgc ctccgatggc 540 

gaagagtggt ttgtcatggc actgctcttc gcggcaaatc gctggggcaa cggtgaaggc 600 

atctttaatt atcaggccga ggcgcagaag atcctggatg tgatgctgca taagagcgaa 660 

gaggacaacg gtctcgccac cagcatgttc gatccggaca cgaagcaggt ggtgtttgtg 720 

ccggccgggc gccaggccac attcaccgat ccgtcttatc acttgcccgc gttctatgaa 780 

ctgtgggcgc gctgggctga caaggataac gatttttgga aagaagcggc gcaggccagc 840 

cgcgaatttt ggaagaaggc ggcgcatccg gaaacgggcc tgatgtctga ctacgccgag 900 

tttgacggca gaccccaggc cgattctgaa cacaaggatt ttcgctatga cgcgttccgt 960 

gtggcgtcca atgtggcgct cgattgggcc tggttcgccg ccgatccgtg ggaggtggaa 1020 

cagagcaatc ggttgttgga tttcttccgt tcacaaggca tggataagta tccgagtcta 1080 

tacaacatcg atggcacgcc gttatccact aatcgctcgc cgggtttgat cgccatgaac 1140 

gccacagctg gactcgcggc tgatccggaa aagagcaagg actttgtgca ggcgctatgg 1200 

gatctggaaa ttcccagcgg acaatggcgc tattacgatg gggtgctgta tttcctggcg 1260 

ctgttgcaag ccagcggcaa ctatcgcatc tacacgcccg atatgcccaa ggtggtgcgg 1320 

cccacaccta cgcccgatcc gatcacgcaa gcgaaatttg cacccggcga tgacgcggtg 1380 

ctgttcagtg tggaaacaga tgcactcgac gaatatgtga cggcgacggg ctttgagccg 1440 

ggcggcgtga tgttgaacac tactttggac agcgcctctt ttgacgcacc actgcctgac 1500 

agcgctctgc tgatcggatt ggacgtcagc gatcaataa 1539 

<210> 8 
<211> 512 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(57)

<400> 8 

Met Ala Arg Leu Ile Thr Tyr Cys Leu Ile Gly Val Leu Leu Val Met
1 5 10 15

Pro Val Leu Ala Ala Cys Ser Thr Ala Pro Thr Pro Thr Leu Met Ser
20 25 30

Gln Pro Thr Ser Thr Pro Gln Pro Ala Leu Gln Pro Thr Pro Pro Pro
35 40 45

Thr Ser Val Pro Arg Ser Ile Gly Ala Phe Glu Ser Gly Gln Tyr Arg
50 55 60

Asn Leu Phe Thr Glu Leu Leu Gly Lys Ser Glu Ala Glu Ile Gln Gln
65 70 75 80

Lys Ile Asp Gln Ala Trp Ala Gln Leu Phe Tyr Gly Asp Asn Asp Thr
85 90 95

Gln Arg Val Tyr Tyr Pro Val Gly Arg Asp Arg Ala Tyr Ile Lys Asp
100 105 110

Ile Gly Asn Asn Asp Val Arg Ser Glu Gly Met Ser Tyr Gly Met Met
115 120 125

Leu Ala Val Gln Leu Asp Lys Gln Glu Glu Phe Asn Lys Leu Trp Lys
130 135 140

Trp Ala His Thr Tyr Met Leu Gln Lys Asp Gly Pro Tyr Lys Gly Tyr
145 150 155 160

Phe Ala Trp His Ala Asn Glu Asn Gly Glu Gln Leu Asp Ala Gly Pro
165 170 175

Ala Ser Asp Gly Glu Glu Trp Phe Val Met Ala Leu Leu Phe Ala Ala
180 185 190

Asn Arg Trp Gly Asn Gly Glu Gly Ile Phe Asn Tyr Gln Ala Glu Ala
195 200 205

Gln Lys Ile Leu Asp Val Met Leu His Lys Ser Glu Glu Asp Asn Gly
210 215 220

Leu Ala Thr Ser Met Phe Asp Pro Asp Thr Lys Gln Val Val Phe Val
225 230 235 240

Pro Ala Gly Arg Gln Ala Thr Phe Thr Asp Pro Ser Tyr His Leu Pro
245 250 255

Ala Phe Tyr Glu Leu Trp Ala Arg Trp Ala Asp Lys Asp Asn Asp Phe
260 265 270
Trp Lys Glu Ala Ala Gln Ala Ser Arg Glu Phe Trp Lys Lys Ala Ala
275 280 285

His Pro Glu Thr Gly Leu Met Ser Asp Tyr Ala Glu Phe Asp Gly Arg
290 295 300

Pro Gln Ala Asp Ser Glu His Lys Asp Phe Arg Tyr Asp Ala Phe Arg
305 310 315 320

Val Ala Ser Asn Val Ala Leu Asp Trp Ala Trp Phe Ala Ala Asp Pro
325 330 335

Trp Glu Val Glu Gln Ser Asn Arg Leu Leu Asp Phe Phe Arg Ser Gln
340 345 350

Gly Met Asp Lys Tyr Pro Ser Leu Tyr Asn Ile Asp Gly Thr Pro Leu
355 360 365

Ser Thr Asn Arg Ser Pro Gly Leu Ile Ala Met Asn Ala Thr Ala Gly
370 375 380

Leu Ala Ala Asp Pro Glu Lys Ser Lys Asp Phe Val Gln Ala Leu Trp
385 390 395 400

Asp Leu Glu Ile Pro Ser Gly Gln Trp Arg Tyr Tyr Asp Gly Val Leu
405 410 415

Tyr Phe Leu Ala Leu Leu Gln Ala Ser Gly Asn Tyr Arg Ile Tyr Thr
420 425 430

Pro Asp Met Pro Lys Val Val Arg Pro Thr Pro Thr Pro Asp Pro Ile
435 440 445

Thr Gln Ala Lys Phe Ala Pro Gly Asp Asp Ala Val Leu Phe Ser Val
450 455 460

Glu Thr Asp Ala Leu Asp Glu Tyr Val Thr Ala Thr Gly Phe Glu Pro
465 470 475 480

Gly Gly Val Met Leu Asn Thr Thr Leu Asp Ser Ala Ser Phe Asp Ala
485 490 495

Pro Leu Pro Asp Ser Ala Leu Leu Ile Gly Leu Asp Val Ser Asp Gln
500 505 510

<210> 9 
<211> 1311 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample 



<400> 9 

atgtttccac gtctttcacc aagccgcttc aggcaagtta ccttaacctt gctcacgctc 60

ggccttgtgt cactgaccgg ttgtgcaggt aacagcaagc cggatgcaga caccagtact 120

gctggtgccg ttgctaccgg cgagtaccgc aatctgtttg ccgaaatcgg aaaaagcgaa 180

atagacatcc agcgcaaaat tgacgaggcg tttcagcact tgttttatgg cgacgcgaaa 240

gatgcagctg tctactatca agcgggtgga aacgagaatg gtccactcgc atatgtttac 300

gatgtgaaca gcaatgacgt gcgctcagaa ggcatgagct acggcatgat gattactgtt 360

caaatggaca aaaaagccga gttcgatgca atctggaact gggcgaaaac ctatatgtat 420

caagactccc ccacgcatcc agcgtttggt tactttgcct ggtccatgcg ccgcgatggt 480

gtcgccaatg acgatatgcc agcgccagat ggcgaggaat atttcgtgac cgctctctat 540

ttcgccgccg cccgctgggg taatggcgaa ggtattttca actaccaaca ggaagcggac 600

accattttga gccgcatgcg ccaccgccag gtgatcaccg gcccaaccaa tcgcggagta 660

atgactgcga ccaatctgtt ccacccggaa gaggcgcaag tgcgcttcac gcccgacatc 720

aataatgctg atcatacaga cgcgtcttac catctgccct cgttctatga aatttgggca 780

cgtgtcgcgc cgcaagaaga tcgcgcgttt tgggccaaag cggccgatgt gagccgcgac 840

tattttgcca aagccgccca ccctgtcact gcgttaacac cggactacgg taattttgat 900

ggcaccccgt gggcggcatc ctggcggccg gagtcggtag attttcgata cgatgcctgg 960

cgttccgtca tgaactggtc catggactat gcctggtggg gcaaagattc aggcgcacct 1020

gcgcgcagtg ataaattact cgcgttcttc gaaacccagg aaggcaaaat gaaccacctc 1080

tatagcctgg atggcaaacc gctgggtggt ggaccgaccc tcggcctaat ttccatgaat 1140

gcaacggcag ctatggcagc tactgatccc cgctggcaca attttgtgga aaagctctgg 1200

caacaacaac cccccacagg gcaataccgg tactacgacg gtgttctata cctgatggcg 1260

ctgctacatt gcgctgggga gtacaaagcg tggatccccg acggggaata a 1311



<210> 10 
<211> 436 
<212> PRT 
<213> Unknown 

<220> 
<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(36)

<400> 10

Met Phe Pro Arg Leu Ser Pro Ser Arg Phe Arg Gln Val Thr Leu Thr
1 5 10 15

Leu Leu Thr Leu Gly Leu Val Ser Leu Thr Gly Cys Ala Gly Asn Ser
20 25 30

Lys Pro Asp Ala Asp Thr Ser Thr Ala Gly Ala Val Ala Thr Gly Glu
35 40 45

Tyr Arg Asn Leu Phe Ala Glu Ile Gly Lys Ser Glu Ile Asp Ile Gln
50 55 60

Arg Lys Ile Asp Glu Ala Phe Gln His Leu Phe Tyr Gly Asp Ala Lys
65 70 75 80

Asp Ala Ala Val Tyr Tyr Gln Ala Gly Gly Asn Glu Asn Gly Pro Leu
85 90 95

Ala Tyr Val Tyr Asp Val Asn Ser Asn Asp Val Arg Ser Glu Gly Met
100 105 110

Ser Tyr Gly Met Met Ile Thr Val Gln Met Asp Lys Lys Ala Glu Phe
115 120 125

Asp Ala Ile Trp Asn Trp Ala Lys Thr Tyr Met Tyr Gln Asp Ser Pro
130 135 140

Thr His Pro Ala Phe Gly Tyr Phe Ala Trp Ser Met Arg Arg Asp Gly
145 150 155 160

Val Ala Asn Asp Asp Met Pro Ala Pro Asp Gly Glu Glu Tyr Phe Val
165 170 175

Thr Ala Leu Tyr Phe Ala Ala Ala Arg Trp Gly Asn Gly Glu Gly Ile
180 185 190

Phe Asn Tyr Gln Gln Glu Ala Asp Thr Ile Leu Ser Arg Met Arg His
195 200 205

Arg Gln Val Ile Thr Gly Pro Thr Asn Arg Gly Val Met Thr Ala Thr
210 215 220

Asn Leu Phe His Pro Glu Glu Ala Gln Val Arg Phe Thr Pro Asp Ile
225 230 235 240

Asn Asn Ala Asp His Thr Asp Ala Ser Tyr His Leu Pro Ser Phe Tyr
245 250 255

Glu Ile Trp Ala Arg Val Ala Pro Gln Glu Asp Arg Ala Phe Trp Ala
260 265 270

Lys Ala Ala Asp Val Ser Arg Asp Tyr Phe Ala Lys Ala Ala His Pro
275 280 285

Val Thr Ala Leu Thr Pro Asp Tyr Gly Asn Phe Asp Gly Thr Pro Trp
290 295 300

Ala Ala Ser Trp Arg Pro Glu Ser Val Asp Phe Arg Tyr Asp Ala Trp
305 310 315 320

Arg Ser Val Met Asn Trp Ser Met Asp Tyr Ala Trp Trp Gly Lys Asp
325 330 335

Ser Gly Ala Pro Ala Arg Ser Asp Lys Leu Leu Ala Phe Phe Glu Thr
340 345 350

Gln Glu Gly Lys Met Asn His Leu Tyr Ser Leu Asp Gly Lys Pro Leu
355 360 365

Gly Gly Gly Pro Thr Leu Gly Leu Ile Ser Met Asn Ala Thr Ala Ala
370 375 380

Met Ala Ala Thr Asp Pro Arg Trp His Asn Phe Val Glu Lys Leu Trp
385 390 395 400

Gln Gln Gln Pro Pro Thr Gly Gln Tyr Arg Tyr Tyr Asp Gly Val Leu
405 410 415

Tyr Leu Met Ala Leu Leu His Cys Ala Gly Glu Tyr Lys Ala Trp Ile
420 425 430

Pro Asp Gly Glu
435

<210> 11 
<211> 1224 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 



<400> 11 

atgcggaacg tcgtgcgtaa accattgaca atcggactcg ctttaacact attattgccc 60

atgggaatga cggcaacatc agcgaagaat gcagattcct atgcgaaaaa acctcacatc 120

agcgcattga atgccccaca attggatcaa cgctacaaaa acgagttcac gattggtgcg 180

gcagtagaac cttatcaact acaaaatgaa aaagacgtac aaatgctaaa gcgccacttc 240

aacagcattg ttgccgagaa cgtaatgaaa ccgatcagca ttcaacctga ggaaggaaaa 300

ttcaattttg aacaagcgga tcgaattgtg aagttcgcta aggcaaatgg catggatatt 360

cgcttccata cactcgtttg gcacagccaa gtacctcaac ggttctttct tgacaaggaa 420

ggtaagccaa tggtcaatga aacagatcca gtgaaacgtg aacaaaataa acaactgctg 480

ttaaaacgac ttgaaactca tattaaaacg atcgtcgagc ggtacaaaga tgacattaag 540

tactgggacg ttgtaaatga ggttgtgggg gacgacggaa aactgcgcaa ctctccatgg 600

tatcaaatcg ccggcatcga ttatattaaa gtggcattcc aagcagctag aaaatatggc 660

ggagacaaca ttaagcttta catgaatgat tacaatacag aagtcgaacc gaagcgaacc 720

gctctttaca atttagtcaa acaactgaaa gaagagggtg ttccgatcga cggcatcggc 780

catcaatccc acatccaaat cggctggcct tctgaagcag aaatcgagaa aacgattaac 840

atgttcgccg ctttcggttt agacaaccaa atcactgagc ttgatgtgag catgtacggt 900

tggccgccgc gcgcttaccc gacgtatgac gccattccaa aacaaaagtt tttggatcag 960

gcagcgcgct atgatcgttt gttcaaactg tatgagaagt tgagcgataa aattagcaac 1020

gtcaccttct ggggcatcgc cgacaatcat acgtggctcg acagccgtgc ggatgtgtac 1080

tatgacgcca acgggaatgt tgtggttgac ccgaacgctc cgtacgcaaa agtggaaaaa 1140

gggaaaggaa aagatgcgcc gttcgttttt ggaccggatt acaaagtcaa acccgcatat 1200

tgggctatta ttgaccacaa atag 1224



<210> 12 
<211> 407 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(28)

<400> 12

Met Arg Asn Val Val Arg Lys Pro Leu Thr Ile Gly Leu Ala Leu Thr
1 5 10 15

Leu Leu Leu Pro Met Gly Met Thr Ala Thr Ser Ala Lys Asn Ala Asp
20 25 30

Ser Tyr Ala Lys Lys Pro His Ile Ser Ala Leu Asn Ala Pro Gln Leu
35 40 45

Asp Gln Arg Tyr Lys Asn Glu Phe Thr Ile Gly Ala Ala Val Glu Pro
50 55 60

Tyr Gln Leu Gln Asn Glu Lys Asp Val Gln Met Leu Lys Arg His Phe
65 70 75 80

Asn Ser Ile Val Ala Glu Asn Val Met Lys Pro Ile Ser Ile Gln Pro
85 90 95

Glu Glu Gly Lys Phe Asn Phe Glu Gln Ala Asp Arg Ile Val Lys Phe
100 105 110

Ala Lys Ala Asn Gly Met Asp Ile Arg Phe His Thr Leu Val Trp His
115 120 125

Ser Gln Val Pro Gln Arg Phe Phe Leu Asp Lys Glu Gly Lys Pro Met
130 135 140

Val Asn Glu Thr Asp Pro Val Lys Arg Glu Gln Asn Lys Gln Leu Leu
145 150 155 160

Leu Lys Arg Leu Glu Thr His Ile Lys Thr Ile Val Glu Arg Tyr Lys
165 170 175

Asp Asp Ile Lys Tyr Trp Asp Val Val Asn Glu Val Val Gly Asp Asp
180 185 190

Gly Lys Leu Arg Asn Ser Pro Trp Tyr Gln Ile Ala Gly Ile Asp Tyr
195 200 205

Ile Lys Val Ala Phe Gln Ala Ala Arg Lys Tyr Gly Gly Asp Asn Ile
210 215 220

Lys Leu Tyr Met Asn Asp Tyr Asn Thr Glu Val Glu Pro Lys Arg Thr
225 230 235 240

Ala Leu Tyr Asn Leu Val Lys Gln Leu Lys Glu Glu Gly Val Pro Ile
245 250 255







Asp Gly Ile Gly His Gln Ser His Ile Gln Ile Gly Trp Pro Ser Glu
260 265 270

Ala Glu Ile Glu Lys Thr Ile Asn Met Phe Ala Ala Phe Gly Leu Asp
275 280 285

Asn Gln Ile Thr Glu Leu Asp Val Ser Met Tyr Gly Trp Pro Pro Arg
290 295 300

Ala Tyr Pro Thr Tyr Asp Ala Ile Pro Lys Gln Lys Phe Leu Asp Gln
305 310 315 320

Ala Ala Arg Tyr Asp Arg Leu Phe Lys Leu Tyr Glu Lys Leu Ser Asp
325 330 335

Lys Ile Ser Asn Val Thr Phe Trp Gly Ile Ala Asp Asn His Thr Trp
340 345 350

Leu Asp Ser Arg Ala Asp Val Tyr Tyr Asp Ala Asn Gly Asn Val Val
355 360 365

Val Asp Pro Asn Ala Pro Tyr Ala Lys Val Glu Lys Gly Lys Gly Lys
370 375 380

Asp Ala Pro Phe Val Phe Gly Pro Asp Tyr Lys Val Lys Pro Ala Tyr
385 390 395 400

Trp Ala Ile Ile Asp His Lys
405

<210> 13 
<211> 1053 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 13 

atgaaagacg cgctccagtg ctctcccctt ttcaaagcct atgaaaaata cttccgcatc 60 

ggcgcggcgg ttagcagctt catgaccttt gatcccgctt accgcgccct gatccgccgc 120 

cattacaatt ccctgacggc ggacaaccag atgaagccgg aaagcgtgtt ggatcgcacc 180 

gcgaccctgg cgaagggcga cctgctccac gctgcggtgg atttcacccg tgtggacgcg 240 

ctgatgtact ttgcacggga caacgggatc cccatgcggt atcacaccct ggcctggcac 300 

aaccagacgc cccgctggtt cttcgcgaag gactggagcg acgcggaaag cgccgaaccc 360 

gcctcaaagg aaaccatgct tgcccgtctg gaaaactata tcctggatgt catgaaccat 420 

gtgaatacca agtttcccgg tctggtttac acctgggacg tggtaaacga agccattgag 480 

ccagagctga aagccccggg attgtaccgg acctggagcc cctggttcaa aacctgcgga 540 

gaagatttcc tctttaccgc tttccgggcc gcccgcaagg gacaggcgcc cggtcagacc 600 

ctttgctata acgactataa cgccttcgag cccgtcaagc gggacgcgat tatcgatctg 660 

ctgaagaagc tgcaggcgga aaacctggtg gataccatgg gtatgcaggg gcattatgtc 720 

atggactgga tgaacatctc gctctgcgaa gaggccgccc gcgcctatgc cgccctgggc 780 

ctgaaggtcc aggtcaccga gctggatatc cactgcaaca gcgacgatga agcccacagc 840 

caaaagctgg cgcagcttta cggcgattat ttcgccatgc tgaagaagct gaaggaggaa 900 

ggcgtcgaca tcgaagccgt caccttctgg ggcgtcaccg accaggacag ctggctcacc 960 

ggtttccgta aagagacaag ctatcccctc ctcttcgacc gcgccaagca ggccaaggat 1020 

gcctatgacg ccgtcatgaa agccgcggaa taa 1053 

<210> 14 
<211> 350 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 14 

Met Lys Asp Ala Leu Gln Cys Ser Pro Leu Phe Lys Ala Tyr Glu Lys
1 5 10 15

Tyr Phe Arg Ile Gly Ala Ala Val Ser Ser Phe Met Thr Phe Asp Pro
20 25 30

Ala Tyr Arg Ala Leu Ile Arg Arg His Tyr Asn Ser Leu Thr Ala Asp
35 40 45

Asn Gln Met Lys Pro Glu Ser Val Leu Asp Arg Thr Ala Thr Leu Ala
50 55 60

Lys Gly Asp Leu Leu His Ala Ala Val Asp Phe Thr Arg Val Asp Ala
65 70 75 80

Leu Met Tyr Phe Ala Arg Asp Asn Gly Ile Pro Met Arg Tyr His Thr
85 90 95

Leu Ala Trp His Asn Gln Thr Pro Arg Trp Phe Phe Ala Lys Asp Trp
100 105 110

Ser Asp Ala Glu Ser Ala Glu Pro Ala Ser Lys Glu Thr Met Leu Ala
115 120 125

Arg Leu Glu Asn Tyr Ile Leu Asp Val Met Asn His Val Asn Thr Lys
130 135 140

Phe Pro Gly Leu Val Tyr Thr Trp Asp Val Val Asn Glu Ala Ile Glu
145 150 155 160

Pro Glu Leu Lys Ala Pro Gly Leu Tyr Arg Thr Trp Ser Pro Trp Phe
165 170 175

Lys Thr Cys Gly Glu Asp Phe Leu Phe Thr Ala Phe Arg Ala Ala Arg
180 185 190

Lys Gly Gln Ala Pro Gly Gln Thr Leu Cys Tyr Asn Asp Tyr Asn Ala
195 200 205

Phe Glu Pro Val Lys Arg Asp Ala Ile Ile Asp Leu Leu Lys Lys Leu
210 215 220

Gln Ala Glu Asn Leu Val Asp Thr Met Gly Met Gln Gly His Tyr Val
225 230 235 240

Met Asp Trp Met Asn Ile Ser Leu Cys Glu Glu Ala Ala Arg Ala Tyr
245 250 255

Ala Ala Leu Gly Leu Lys Val Gln Val Thr Glu Leu Asp Ile His Cys
260 265 270

Asn Ser Asp Asp Glu Ala His Ser Gln Lys Leu Ala Gln Leu Tyr Gly
275 280 285

Asp Tyr Phe Ala Met Leu Lys Lys Leu Lys Glu Glu Gly Val Asp Ile
290 295 300

Glu Ala Val Thr Phe Trp Gly Val Thr Asp Gln Asp Ser Trp Leu Thr
305 310 315 320

Gly Phe Arg Lys Glu Thr Ser Tyr Pro Leu Leu Phe Asp Arg Ala Lys
325 330 335

Gln Ala Lys Asp Ala Tyr Asp Ala Val Met Lys Ala Ala Glu
340 345 350

<210> 15 
<211> 1110 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 15 

atgaaacgtc ctctagtcaa tctcctgaca accgcctgcc tcctcgttgc cgcaaatgct 60 

gcagaaccca ccctccgcga agcctacgaa aagcactttg ccgtgggtgt cgcactcaat 120 

accgctcaag tgactggtcg aaacaaagcc gcaggcgaac tcgccgcgaa gcagttcaat 180 

tccatcaccg ctgagaatga catgaagtgg caatcgcttc atccagagct cgatacctac 240 

cgctttgaat cggccgatgc ctatatcgac tttgccaaaa agaatgagat ggaagtcata 300 

ggccacactc tcgtctggca cagccagacc cctcagtggg tgttccaagg cgacgatggc 360 

aaacccgcga cacgggaaga acttctcaag cggatgcgcg atcacattca caaggtcgcc 420 

ggccgataca agggtaaggt caagggctgg gacgtcgtca atgaggcgct ctccgacgga 480 

ggtcaggaca ttctacgcga atctccgtgg cggcgaatca tcggagacga tttcatcgat 540 

cacgctttcc gctacgcccg cgaagccgac ccaaaggcag aactttacta caacgactac 600 

aacctcgaaa tccctcgcaa acgcgagaac tgcatcaagc tcgtcaaggg catgcttgag 660 

cgcggcgtcc ccatcgacgg cattggaacg caatcccatt ttcagcttgg cttcccatcg 720 

ctggaagatg tcgagaccac gattgaagag tttggaaaac tcggccttaa ggtcatgatt 780 

accgaactcg atgtggatgt cctccctcgc aataacccag gcgtcgccga catcagtcag 840 

cgcgagcaag gtagcaatcc ctacactgag ggcctgcccg aggatgttca aaagcagctt 900 

acgaaacgct acgaagacat cttcaagatc tacctaaagc accagaaaac ggtcacccgc 960 

gtgaccttct ggggcctcga tgatggtcaa tcatggttga atggctttcc tgttagaggc 1020 

cgcaccaatc acccgctact tttcgatcgt gaactcaaac cgaagcccgt tcttccagtc 1080 

ttgatagagc tcggcaagaa gaagcgataa 1110 

<210> 16 
<211> 369 
<212> PRT 
<213> Unknown 

<220> 
<223> Obtained from an environmental sample

<221> SIGNAL 
<222> (1)...(20) 

<400> 16 

Met Lys Arg Pro Leu Val Asn Leu Leu Thr Thr Ala Cys Leu Leu Val
1 5 10 15

Ala Ala Asn Ala Ala Glu Pro Thr Leu Arg Glu Ala Tyr Glu Lys His
20 25 30

Phe Ala Val Gly Val Ala Leu Asn Thr Ala Gln Val Thr Gly Arg Asn
35 40 45

Lys Ala Ala Gly Glu Leu Ala Ala Lys Gln Phe Asn Ser Ile Thr Ala
50 55 60

Glu Asn Asp Met Lys Trp Gln Ser Leu His Pro Glu Leu Asp Thr Tyr
65 70 75 80

Arg Phe Glu Ser Ala Asp Ala Tyr Ile Asp Phe Ala Lys Lys Asn Glu
85 90 95

Met Glu Val Ile Gly His Thr Leu Val Trp His Ser Gln Thr Pro Gln
100 105 110

Trp Val Phe Gln Gly Asp Asp Gly Lys Pro Ala Thr Arg Glu Glu Leu
115 120 125

Leu Lys Arg Met Arg Asp His Ile His Lys Val Ala Gly Arg Tyr Lys
130 135 140

Gly Lys Val Lys Gly Trp Asp Val Val Asn Glu Ala Leu Ser Asp Gly
145 150 155 160

Gly Gln Asp Ile Leu Arg Glu Ser Pro Trp Arg Arg Ile Ile Gly Asp
165 170 175

Asp Phe Ile Asp His Ala Phe Arg Tyr Ala Arg Glu Ala Asp Pro Lys
180 185 190

Ala Glu Leu Tyr Tyr Asn Asp Tyr Asn Leu Glu Ile Pro Arg Lys Arg
195 200 205

Glu Asn Cys Ile Lys Leu Val Lys Gly Met Leu Glu Arg Gly Val Pro
210 215 220

Ile Asp Gly Ile Gly Thr Gln Ser His Phe Gln Leu Gly Phe Pro Ser
225 230 235 240

Leu Glu Asp Val Glu Thr Thr Ile Glu Glu Phe Gly Lys Leu Gly Leu
245 250 255

Lys Val Met Ile Thr Glu Leu Asp Val Asp Val Leu Pro Arg Asn Asn
260 265 270

Pro Gly Val Ala Asp Ile Ser Gln Arg Glu Gln Gly Ser Asn Pro Tyr
275 280 285

Thr Glu Gly Leu Pro Glu Asp Val Gln Lys Gln Leu Thr Lys Arg Tyr
290 295 300

Glu Asp Ile Phe Lys Ile Tyr Leu Lys His Gln Lys Thr Val Thr Arg
305 310 315 320

Val Thr Phe Trp Gly Leu Asp Asp Gly Gln Ser Trp Leu Asn Gly Phe
325 330 335

Pro Val Arg Gly Arg Thr Asn His Pro Leu Leu Phe Asp Arg Glu Leu
340 345 350

Lys Pro Lys Pro Val Leu Pro Val Leu Ile Glu Leu Gly Lys Lys Lys
355 360 365

Arg

<210> 17 
<211> 1035 
<212> DNA 
<213> Bacteria 

<400> 17 

atgtcccggc acgtcatcgc cctgtccgcc gccgtctgcc tcgcggccgg cctcgccgcc 60 

gcgcccgcga gcgccgagcc gcgtccccgg acgctcggcg aactggccaa gaagcaccac 120 

aagtacttcg gctcggccac cgacaacccc gagttcaccg acgccgccta tctgaagctc 180 

ctcggcagcg agttcgggca gaccaccccc ggcaacgcca tgaagtggta cgccaccgaa 240 

cccgcgcccg gcgtcttcga cttcaccgcg ggcgacgagg tcgtggcctt cgccaaggcc 300 

catcaccaga aggtccgcgg ccacaccctc gtctggcaca gccagctccc cgcctggctc 360 

accgagcgca gctggaccgc cgcggaactg cgccccgtcc tcaagaatca catccagaag 420 

gtggcccggc actacaaggg caaggtcatc cactgggacg tcgtcaacga ggccttcaac 480 
gaggacggca cctaccgcga gtcggtcttc tacaagacgc tcggccccgg ctacatcgcc 540

gacgccctgc gctgggccca cgaggccgac ccgcacgcca agctgtacct caacgactac 600 

aacgtcgacg ggatcggccc caagagcgac gcctactacc gcctgatcaa gcagctgaag 660 

gccgacggcg tcccggtgga gggcttcggc atccaggggc acctggcgct ccagtacggc 720 

ttccccgccg acgtcaagca gaacatgcag cgcttcgccg acctcggcgt cgaggtcgcg 780 

gtcaccgagc tcgacatccg gatgaacctc ccggcgaccc cttcgatgct cgccacccag 840 

gccacctggt acgccgacta cgtcaaggcc tgcctggagg tcaggaagtg cgtcggcgtc 900 

accatctggg actacaccga caagtactcg tggatcccct ccgtcttccc cggtgagggc 960 

gccgcgctgc cctacgacga gaacctggcg cccaagcccg cctaccacgc gatcaggaag 1020 

gtgctgggcg gatga 1035 

<210> 18 
<211> 344 
<212> PRT 
<213> Bacteria 

<220> 

<221> SIGNAL 
<222> (1)...(31)

<400> 18 

Met Ser Arg His Val Ile Ala Leu Ser Ala Ala Val Cys Leu Ala Ala
1 5 10 15

Gly Leu Ala Ala Ala Pro Ala Ser Ala Glu Pro Arg Pro Arg Thr Leu
20 25 30

Gly Glu Leu Ala Lys Lys His His Lys Tyr Phe Gly Ser Ala Thr Asp
35 40 45

Asn Pro Glu Phe Thr Asp Ala Ala Tyr Leu Lys Leu Leu Gly Ser Glu
50 55 60

Phe Gly Gln Thr Thr Pro Gly Asn Ala Met Lys Trp Tyr Ala Thr Glu
65 70 75 80

Pro Ala Pro Gly Val Phe Asp Phe Thr Ala Gly Asp Glu Val Val Ala
85 90 95

Phe Ala Lys Ala His His Gln Lys Val Arg Gly His Thr Leu Val Trp
100 105 110

His Ser Gln Leu Pro Ala Trp Leu Thr Glu Arg Ser Trp Thr Ala Ala
115 120 125

Glu Leu Arg Pro Val Leu Lys Asn His Ile Gln Lys Val Ala Arg His
130 135 140

Tyr Lys Gly Lys Val Ile His Trp Asp Val Val Asn Glu Ala Phe Asn
145 150 155 160

Glu Asp Gly Thr Tyr Arg Glu Ser Val Phe Tyr Lys Thr Leu Gly Pro
165 170 175

Gly Tyr Ile Ala Asp Ala Leu Arg Trp Ala His Glu Ala Asp Pro His
180 185 190

Ala Lys Leu Tyr Leu Asn Asp Tyr Asn Val Asp Gly Ile Gly Pro Lys
195 200 205

Ser Asp Ala Tyr Tyr Arg Leu Ile Lys Gln Leu Lys Ala Asp Gly Val
210 215 220

Pro Val Glu Gly Phe Gly Ile Gln Gly His Leu Ala Leu Gln Tyr Gly
225 230 235 240

Phe Pro Ala Asp Val Lys Gln Asn Met Gln Arg Phe Ala Asp Leu Gly
245 250 255

Val Glu Val Ala Val Thr Glu Leu Asp Ile Arg Met Asn Leu Pro Ala
260 265 270

Thr Pro Ser Met Leu Ala Thr Gln Ala Thr Trp Tyr Ala Asp Tyr Val
275 280 285

Lys Ala Cys Leu Glu Val Arg Lys Cys Val Gly Val Thr Ile Trp Asp
290 295 300

Tyr Thr Asp Lys Tyr Ser Trp Ile Pro Ser Val Phe Pro Gly Glu Gly
305 310 315 320

Ala Ala Leu Pro Tyr Asp Glu Asn Leu Ala Pro Lys Pro Ala Tyr His
325 330 335

Ala Ile Arg Lys Val Leu Gly Gly
340

<210> 19 
<211> 1152 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 19 

atgaagatgt taaaaactat tgttgtggct gtagcagcct tactatccag tcctactgct 60 

tcagccactt tacagaacct gaagcgggct cctgattcat tgaccttgaa agatgcattt 120 

gagggtaagt tttatatagg aacagcatta aaccttgatc agatatggga gcgcgatcag 180 

gctgcggtcg cggtggtcaa aacgcagttc aactccatag ttgctgagaa ttgtatgaaa 240 

agtatgtttt tgcaaccaag ggaaggtgag tttgatttta gggatgcgga ccgttttgtc 300 

gcgtttggag aaaaaaataa aatgcaaatt atcggtcata cgctgatttg gcattcgcag 360 

acaccagctt ggttttttgt cgataaaaat gggaaagagg tcacccgaga ggtacttatc 420 

gagcgcatgc ggaagcatat acaaaccgtt gtttcccgct ataagggaag ggtgtttggt 480 

tgggatgtgg tgaacgaagc catattggat aatggagaat ggcgtaaaag caaattctac 540 

cagattatcg ggccacaatt tattgaattg gccttcaaat ttgcgcatga cgcagatcca 600 

aatgcagaat tatattataa cgattattca actgctatcc ccgaaaaaag aaaggggatt 660 

atgcgcatgg tgcagcaggt aaaggctgcc ggtgggcagg tcactggaat tggtatgcag 720 

gaacacaacg cattggacaa tccaccggtc gatgaagtcg aaaaaaccat actcggattt 780 

gcaagccttg gtgcgaaggt aatggttacg gaaatggata tttcggtcct gccgcatgta 840 

cgtcccaata tgggcgcaga aataggggag cgtcatgcct acagtaaagc gatgaatccg 900 

tacgaaaaag gacttcctgt aacgaaaatg aacgagttgg gagcgagata tgtagcgttt 960 

tttaatttat atctcaaaca tcgggataaa atatcgcgtg tgacattgtg gggtgttggc 1020 

gatggagatt catggaagaa tggttggcct attcccggac gtacagacta tccattgtta 1080 

ttcgatcgga attaccaacc caaacctttt gtaaaagata ttattgcgtt gactcaaaaa 1140 

aaaaagaaat aa 1152 

<210> 20 

<211> 383 

<212> PRT 

<213> unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(29) 



<400> 20 



Met Lys Met Leu Lys Thr Ile Val Val Ala Val Ala Ala Leu Leu Ser 
1 5 10 15 

Ser Pro Thr Ala Ser Ala Thr Leu Gln Asn Leu Lys Arg Ala Pro Asp 
20 25 30 

Ser Leu Thr Leu Lys Asp Ala Phe Glu Gly Lys Phe Tyr Ile Gly Thr 
35 40 45 

Ala Leu Asn Leu Asp Gln Ile Trp Glu Arg Asp Gln Ala Ala Val Ala 
50 55 60 

Val Val Lys Thr Gln Phe Asn Ser Ile Val Ala Glu Asn Cys Met Lys 
65 70 75 80 

Ser Met Phe Leu Gln Pro Arg Glu Gly Glu Phe Asp Phe Arg Asp Ala 
85 90 95 

Asp Arg Phe Val Ala Phe Gly Glu Lys Asn Lys Met Gln Ile Ile Gly 
100 105 110 

His Thr Leu Ile Trp His Ser Gln Thr Pro Ala Trp Phe Phe Val Asp 
115 120 125 

Lys Asn Gly Lys Glu Val Thr Arg Glu Val Leu Ile Glu Arg Met Arg 
130 135 140 

Lys His Ile Gln Thr Val Val Ser Arg Tyr Lys Gly Arg Val Phe Gly 
145 150 155 160 

Trp Asp Val Val Asn Glu Ala Ile Leu Asp Asn Gly Glu Trp Arg Lys 
165 170 175 

Ser Lys Phe Tyr Gln Ile Ile Gly Pro Gln Phe Ile Glu Leu Ala Phe 
180 185 190 

Lys Phe Ala His Asp Ala Asp Pro Asn Ala Glu Leu Tyr Tyr Asn Asp 
195 200 205 

Tyr Ser Thr Ala Ile Pro Glu Lys Arg Lys Gly Ile Met Arg Met Val 
210 215 220 

Gln Gln Val Lys Ala Ala Gly Gly Gln Val Thr Gly Ile Gly Met Gln 
225 230 235 240 

Glu His Asn Ala Leu Asp Asn Pro Pro Val Asp Glu Val Glu Lys Thr 
245 250 255 

Ile Leu Gly Phe Ala Ser Leu Gly Ala Lys Val Met Val Thr Glu Met 
260 265 270 

Asp Ile Ser Val Leu Pro His Val Arg Pro Asn Met Gly Ala Glu Ile 
275 280 285 

Gly Glu Arg His Ala Tyr Ser Lys Ala Met Asn Pro Tyr Glu Lys Gly 
290 295 300 

Leu Pro Val Thr Lys Met Asn Glu Leu Gly Ala Arg Tyr Val Ala Phe 
305 310 315 320 

Phe Asn Leu Tyr Leu Lys His Arg Asp Lys Ile Ser Arg Val Thr Leu 
325 330 335 

Trp Gly Val Gly Asp Gly Asp Ser Trp Lys Asn Gly Trp Pro Ile Pro 
340 345 350 

Gly Arg Thr Asp Tyr Pro Leu Leu Phe Asp Arg Asn Tyr Gln Pro Lys 
355 360 365 

Pro Phe Val Lys Asp Ile Ile Ala Leu Thr Gln Lys Lys Lys Lys 
370 375 380 

<210> 21 
<211> 1119 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 21 

atgcggattc actggctggg gctcagctca cgcgcaagcc tgatgacggc ggcgctcctg 60 

gctgtcacag gcaccaccaa atccgaggac tcgcccgcaa ctttgaaaga cgccttcaag 120 

gattgtttcc ggatcggggt cgcgctcaac cagcggcaat ttaccgagca agataccaac 180 

ggcgcgacgt tggtgaaacg gcagttcaac gccatctcac ccgaaaacgt gatgaagtgg 240 

gcgaacattc atccccgacc cgggcccgat gggtataact tcgaggcggc tgaccgttac 300 

gtcgagtttg gcgagaagaa cggaatgttc atcgtcggcc atacgctcgt ttggcacttc 360 

caaacgccgc gctgggtact ccagggcgat ggcactaacg cggcgacgcg cgagctgctg 420 

ctgcagcgga tgcgcgatca catccacacg gtcgtaggcc ggtacaaagg gcggatcaag 480 

gcttgggacg tggtcaacga agcgctgaac gaagatggca ctctgcggcg gtcgcagtgg 540 

taccggatca tcggcgaaga ctacatcgtc aaggctttcg aatatgcgca tgaggccgat 600 

ccgtccgcgg aattgcgata caacgattac gccatcgaga atgagcggaa gcgcgacggc 660 

gtaatcgcgc tcgtgaagaa acttcaggcg cagaaggtcc cacttggggg gctgggctcg 720 

cagacgcatg ccaacctgac ctggcctaac gccgaatcgc tggacaccgc cctcacggcc 780 

ttcaccgaac tgggtatccc gatctcaatc acggaactgg atgtgaccgc ctcgcaacgc 840 

ggtcagctca accagagcgc cgaggtgtcg cagaatggac aggcggggga gggaggcgtg 900 

gtggacgggg cgaatcagaa gctcgccgag cagtacgcca acttcttccg cgtctttctg 960 

aagcatcgca aaaacattga gctcgtgacg ttttggggcg tcacggatcg tgactcctgg 1020 

cggcgcattg gcaaaccgct gctatttaac gcagaatggc aacccaagcc ggcctttcac 1080 

gccgtcatcg ccgaggcgaa aaagatcagt gggcaatga 1119 

<210> 22 
<211> 372 
<212> PRT 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(28) 

<400> 22 

Met Arg Ile His Trp Leu Gly Leu Ser Ser Arg Ala Ser Leu Met Thr 
1 5 10 15 

Ala Ala Leu Leu Ala Val Thr Gly Thr Thr Lys Ser Glu Asp Ser Pro 
20 25 30 

Ala Thr Leu Lys Asp Ala Phe Lys Asp Cys Phe Arg Ile Gly Val Ala 
35 40 45 

Leu Asn Gln Arg Gln Phe Thr Glu Gln Asp Thr Asn Gly Ala Thr Leu 
50 55 60 

Val Lys Arg Gln Phe Asn Ala Ile Ser Pro Glu Asn Val Met Lys Trp 
65 70 75 80 

Ala Asn Ile His Pro Arg Pro Gly Pro Asp Gly Tyr Asn Phe Glu Ala 
85 90 95 

Ala Asp Arg Tyr Val Glu Phe Gly Glu Lys Asn Gly Met Phe Ile Val 
100 105 110 

Gly His Thr Leu Val Trp His Phe Gln Thr Pro Arg Trp Val Leu Gln 
115 120 125 

Gly Asp Gly Thr Asn Ala Ala Thr Arg Glu Leu Leu Leu Gln Arg Met 
130 135 140 

Arg Asp His Ile His Thr Val Val Gly Arg Tyr Lys Gly Arg Ile Lys 
145 150 155 160 

Ala Trp Asp Val Val Asn Glu Ala Leu Asn Glu Asp Gly Thr Leu Arg 
165 170 175 

Arg Ser Gln Trp Tyr Arg Ile Ile Gly Glu Asp Tyr Ile Val Lys Ala 
180 185 190 

Phe Glu Tyr Ala His Glu Ala Asp Pro Ser Ala Glu Leu Arg Tyr Asn 
195 200 205 

Asp Tyr Ala Ile Glu Asn Glu Arg Lys Arg Asp Gly Val Ile Ala Leu 
210 215 220 

Val Lys Lys Leu Gln Ala Gln Lys Val Pro Leu Gly Gly Leu Gly Ser 
225 230 235 240 

Gln Thr His Ala Asn Leu Thr Trp Pro Asn Ala Glu Ser Leu Asp Thr 
245 250 255 

Ala Leu Thr Ala Phe Thr Glu Leu Gly Ile Pro Ile Ser Ile Thr Glu 
260 265 270 

Leu Asp Val Thr Ala Ser Gln Arg Gly Gln Leu Asn Gln Ser Ala Glu 
275 280 285 

Val Ser Gln Asn Gly Gln Ala Gly Glu Gly Gly Val Val Asp Gly Ala 
290 295 300 

Asn Gln Lys Leu Ala Glu Gln Tyr Ala Asn Phe Phe Arg Val Phe Leu 
305 310 315 320 

Lys His Arg Lys Asn Ile Glu Leu Val Thr Phe Trp Gly Val Thr Asp 
325 330 335 

Arg Asp Ser Trp Arg Arg Ile Gly Lys Pro Leu Leu Phe Asn Ala Glu 
340 345 350 

Trp Gln Pro Lys Pro Ala Phe His Ala Val Ile Ala Glu Ala Lys Lys 
355 360 365 

Ile Ser Gly Gln 
370 

<210> 23 
<211> 1137 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 23 

atgaggacaa aacaagtttt taaattaacc acgctcgctt tattattaac agcagttgtt 60 

agtagctgtt ctgccccaaa agcggcaaaa gaagatacgc ttaaagatgc cctccaggga 120 

aaattcttta ttggtgctgc tgttaatgtt gaccaaatgg caggaaagga ttctcttgca 180 

attgaagttg ttaaaaagaa ctttagctca attgtggccg agaattgcat gaaaatggaa 240 

aacatccatc ctgtaaaagg tgaatttttc ttcgatgaag ccgatgcata tgttgaattt 300 

ggcgaaaaaa acaacatgaa aatcattggt cacacattga tttggcattc acaagccgcc 360 

aaatgggcat ttgttgatga tgaaggcaaa gatgtatcgc gcgaagaatt aattgaacgg 420 

atgcgcaacc acatccatac cattgtaggc cgctataaag gtcgtgtaca tggctgggac 480 

gttgttaatg aggctattct ggataacggc gaatggcgtc agagcaaatg gtataccatt 540 

attggacccg aatttgttca gcttgctttt gagtttgccc acgaagccga ccccaacgct 600 

gaattgtatt acaacgacta caacgagtgg attccggcta aaagagacgg catttacaac 660 

atggttaagg atttaatcga caaaggcgtt aaagttgatg gaattggcct acagggtcac 720 

attgctcttg actctcccag catcgaactt tacgaagaag ccattgtaaa atatgcaagt 780 

ctgggtgtgc aaacaatggt taccgaactc gatatcactg ttttaccatg gccatcgcag 840 

caagttacag ccgatatatc ttttagtgca gagctatcaa ccgaatacaa tccatttgtt 900 

aatggtttac ccgattcggt tagcgttgaa cttaccaacc gttttgccag tttcttcgag 960 

ttgtttttga aacatcagga taaaattgac cgcgttactc tatggggtgt acacgatggt 1020 

caatcatgga aaaacaactg gcccatcagg ggacgtaaag attatccgtt gttattcgac 1080 

aggcaatatc agtccaaacc tgccgttcag cgcataatcg aattggctaa acaataa 1137 

<210> 24 
<211> 378 
<212> PRT 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(29) 

<400> 24 

Met Arg Thr Lys Gln Val Phe Lys Leu Thr Thr Leu Ala Leu Leu Leu 
1 5 10 15 

Thr Ala Val Val Ser Ser Cys Ser Ala Pro Lys Ala Ala Lys Glu Asp 
20 25 30 

Thr Leu Lys Asp Ala Leu Gln Gly Lys Phe Phe Ile Gly Ala Ala Val 
35 40 45 

Asn Val Asp Gln Met Ala Gly Lys Asp Ser Leu Ala Ile Glu Val Val 
50 55 60 

Lys Lys Asn Phe Ser Ser Ile Val Ala Glu Asn Cys Met Lys Met Glu 
65 70 75 80 

Asn Ile His Pro Val Lys Gly Glu Phe Phe Phe Asp Glu Ala Asp Ala 
85 90 95 

Tyr Val Glu Phe Gly Glu Lys Asn Asn Met Lys Ile Ile Gly His Thr 
100 105 110 

Leu Ile Trp His Ser Gln Ala Ala Lys Trp Ala Phe Val Asp Asp Glu 
115 120 125 

Gly Lys Asp Val Ser Arg Glu Glu Leu Ile Glu Arg Met Arg Asn His 
130 135 140 

Ile His Thr Ile Val Gly Arg Tyr Lys Gly Arg Val His Gly Trp Asp 
145 150 155 160 

Val Val Asn Glu Ala Ile Leu Asp Asn Gly Glu Trp Arg Gln Ser Lys 
165 170 175 

Trp Tyr Thr Ile Ile Gly Pro Glu Phe Val Gln Leu Ala Phe Glu Phe 
180 185 190 

Ala His Glu Ala Asp Pro Asn Ala Glu Leu Tyr Tyr Asn Asp Tyr Asn 
195 200 205 

Glu Trp Ile Pro Ala Lys Arg Asp Gly Ile Tyr Asn Met Val Lys Asp 
210 215 220 

Leu Ile Asp Lys Gly Val Lys Val Asp Gly Ile Gly Leu Gln Gly His 
225 230 235 240 

Ile Ala Leu Asp Ser Pro Ser Ile Glu Leu Tyr Glu Glu Ala Ile Val 
245 250 255 

Lys Tyr Ala Ser Leu Gly Val Gln Thr Met Val Thr Glu Leu Asp Ile 
260 265 270 

Thr Val Leu Pro Trp Pro Ser Gln Gln Val Thr Ala Asp Ile Ser Phe 
275 280 285 

Ser Ala Glu Leu Ser Thr Glu Tyr Asn Pro Phe Val Asn Gly Leu Pro 
290 295 300 

Asp Ser Val Ser Val Glu Leu Thr Asn Arg Phe Ala Ser Phe Phe Glu 
305 310 315 320 

Leu Phe Leu Lys His Gln Asp Lys Ile Asp Arg Val Thr Leu Trp Gly 
325 330 335 

Val His Asp Gly Gln Ser Trp Lys Asn Asn Trp Pro Ile Arg Gly Arg 
340 345 350 

Lys Asp Tyr Pro Leu Leu Phe Asp Arg Gln Tyr Gln Ser Lys Pro Ala 
355 360 365 

Val Gln Arg Ile Ile Glu Leu Ala Lys Gln 
370 375 

<210> 25 
<211> 978 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 25 

gtggatccaa agaattcctt acgcgcctta gctcaaaagc gaggaattgg gtttgggacg 60 

gcagtttggg ttgagcctct gtctaacgat tcgagatatc ggacggtgtt ggcgcaggag 120 

ttcaatatgg tgacgccaga gaatgagatg aagtttgagc cgacgcatcc agaacgggag 180 

cgctacgatt ttacagcagc cgataccctt gttgactttg ccaagaacca taacatgcag 240 

gtgcgcggac ataccctggt ttggcatgaa agtctccccg attggctaac gactcaaacg 300 

tggacgcgtg aggagttgat gtccatctta gaagaacaca tcaatacagt tgtcgatcgc 360 

tatcgggggc aattagttgc ctgggatgtg gtgaatgaag cgatcgccaa cgataaaaac 420 

gcactcagag atacgatttg gctgcgaaca atcgggccag agtatataga gaaggcattt 480 

cgctgggcgc atgcagccga ccctcaagca cgtttatttt acaacgatta tggcggcgag 540 

gaagtggggg gaaagtctga ggccatctat ggcatgctta aagatttgct gcaacagggt 600 

gtcccgattc acggggttgg cttgcaaatg cacgttagta taaaaaaccc tcccaatccc 660 

gaaaaagtgg cggcaaatat caagcgcctg aacgatctgg gattggaagt gcatataact 720 

gagatggatg tgaaaacctg ggatggcatc ggtacgaagc agcaacgact tgcggctcag 780 

gcacaagtgt atcggaacat gatgcaggtg tgtttggaag ctgagaactg taaggcgttt 840 

tcgttgtggg gggtaagcga tcgctattct tggattcccc ggatttttaa gaagccggat 900 

gcaccactga tttttgatga tttagggcgt ccgaaacccg cttacaatgc cctgaaagaa 960 

gtcctcaagc ggcgttaa 978 

<210> 26 
<211> 325 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 26 

Val Asp Pro Lys Asn Ser Leu Arg Ala Leu Ala Gln Lys Arg Gly Ile 
1 5 10 15 

Gly Phe Gly Thr Ala Val Trp Val Glu Pro Leu Ser Asn Asp Ser Arg 
20 25 30 

Tyr Arg Thr Val Leu Ala Gln Glu Phe Asn Met Val Thr Pro Glu Asn 
35 40 45 

Glu Met Lys Phe Glu Pro Thr His Pro Glu Arg Glu Arg Tyr Asp Phe 
50 55 60 

Thr Ala Ala Asp Thr Leu Val Asp Phe Ala Lys Asn His Asn Met Gln 
65 70 75 80 

Val Arg Gly His Thr Leu Val Trp His Glu Ser Leu Pro Asp Trp Leu 
85 90 95 

Thr Thr Gln Thr Trp Thr Arg Glu Glu Leu Met Ser Ile Leu Glu Glu 
100 105 110 

His Ile Asn Thr Val Val Asp Arg Tyr Arg Gly Gln Leu Val Ala Trp 
115 120 125 

Asp Val Val Asn Glu Ala Ile Ala Asn Asp Lys Asn Ala Leu Arg Asp 
130 135 140 

Thr Ile Trp Leu Arg Thr Ile Gly Pro Glu Tyr Ile Glu Lys Ala Phe 
145 150 155 160 

Arg Trp Ala His Ala Ala Asp Pro Gln Ala Arg Leu Phe Tyr Asn Asp 
165 170 175 

Tyr Gly Gly Glu Glu Val Gly Gly Lys Ser Glu Ala Ile Tyr Gly Met 
180 185 190 

Leu Lys Asp Leu Leu Gln Gln Gly Val Pro Ile His Gly Val Gly Leu 
195 200 205 

Gln Met His Val Ser Ile Lys Asn Pro Pro Asn Pro Glu Lys Val Ala 
210 215 220 

Ala Asn Ile Lys Arg Leu Asn Asp Leu Gly Leu Glu Val His Ile Thr 
225 230 235 240 

Glu Met Asp Val Lys Thr Trp Asp Gly Ile Gly Thr Lys Gln Gln Arg 
245 250 255 

Leu Ala Ala Gln Ala Gln Val Tyr Arg Asn Met Met Gln Val Cys Leu 
260 265 270 

Glu Ala Glu Asn Cys Lys Ala Phe Ser Leu Trp Gly Val Ser Asp Arg 
275 280 285 

Tyr Ser Trp Ile Pro Arg Ile Phe Lys Lys Pro Asp Ala Pro Leu Ile 
290 295 300 

Phe Asp Asp Leu Gly Arg Pro Lys Pro Ala Tyr Asn Ala Leu Lys Glu 
305 310 315 320 

Val Leu Lys Arg Arg 
325 

<210> 27 
<211> 1173 
<212> DNA 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
<400> 27 

atgaaatcct taacaaatca atccttcatg aaactcataa tctgtctggc attgccagtc 60 

gcactactca gcatttcatg caaaaaaccc gccgaaccac tgaaaccggt tgaaggctta 120 

aaagacagct tcaaagacaa gtttctcatg ggtgtggcgc tgaataaagc acagattctg 180 

ggaagagata cattggtaca tgcttttaca gtacagcatt ttaattccat tactgcagaa 240 

aacgaaatga agtgggaacg catccacccg cagcctgatg tatatgattt cacggttccg 300 

gacagcctga ttgcttttgg cgaacgcaac ggcatgttta tagtcgggca tacactcgta 360 

tggcactccc aggtgcccga ttgggttttc accgatgaga agggaaagcc tctgacccgc 420 

gatgctctgc tccaacgcat gaaggatcat atttatgccg ttgtcggccg gtataagggc 480 

aaggtggatg gctgggatgt ggtaaatgaa gcattggatg aagacggaca gctgcgcaaa 540 

tccaggtggc atgaaatcat cggtgatgat tacattcaga aagcctttga gttcacccgg 600 

gaggcagatc ccggtgcaga gctttattac aatgattaca acatagaact caaaaaaaag 660 

cgggagggtg ctgtcaggct gctacaggaa ctgcagcaaa aaggcattaa aatcgacgga 720 

gtgggcattc agggacattg gcacctgcac tcacctgatc tgcaagagat tgattcaagt 780 

cttcaggcat acggacaact tggtctgaag gtcatgatca ccgaactgga tgttaacgtc 840 

attcccgaac cttcaggtat tattggcgcc gatgttgcac agcgggcgga ttatcagagc 900 

cagctgaatc catggcctga aagttttccc gattccatgc agcaggttct ggccagccgg 960 

tatgccgaac tgttcggatt gttcctgaag cacagcgata aggtaagccg ggtgaccttc 1020 

tggggaattc acgatggcta ttcctggaag aacaactggc caataccggg ccgaacaact 1080 

tatcccctcc tttttgaccg gaattaccag cctaaacctg cgtatgatgc tgtcattgaa 1140 

ttgaccaaaa tacagccgga agccagtaac tga 1173 

<210> 28 

<211> 390 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(27) 

<400> 28 

Met Lys Ser Leu Thr Asn Gln Ser Phe Met Lys Leu Ile Ile Cys Leu 
1 5 10 15 

Ala Leu Pro Val Ala Leu Leu Ser Ile Ser Cys Lys Lys Pro Ala Glu 
20 25 30 

Pro Leu Lys Pro Val Glu Gly Leu Lys Asp Ser Phe Lys Asp Lys Phe 
35 40 45 

Leu Met Gly Val Ala Leu Asn Lys Ala Gln Ile Leu Gly Arg Asp Thr 
50 55 60 

Leu Val His Ala Phe Thr Val Gln His Phe Asn Ser Ile Thr Ala Glu 
65 70 75 80 

Asn Glu Met Lys Trp Glu Arg Ile His Pro Gln Pro Asp Val Tyr Asp 
85 90 95 

Phe Thr Val Pro Asp Ser Leu Ile Ala Phe Gly Glu Arg Asn Gly Met 
100 105 110 

Phe Ile Val Gly His Thr Leu Val Trp His Ser Gln Val Pro Asp Trp 
115 120 125 

Val Phe Thr Asp Glu Lys Gly Lys Pro Leu Thr Arg Asp Ala Leu Leu 
130 135 140 

Gln Arg Met Lys Asp His Ile Tyr Ala Val Val Gly Arg Tyr Lys Gly 
145 150 155 160 

Lys Val Asp Gly Trp Asp Val Val Asn Glu Ala Leu Asp Glu Asp Gly 
165 170 175 

Gln Leu Arg Lys Ser Arg Trp His Glu Ile Ile Gly Asp Asp Tyr Ile 
180 185 190 

Gln Lys Ala Phe Glu Phe Thr Arg Glu Ala Asp Pro Gly Ala Glu Leu 
195 200 205 

Tyr Tyr Asn Asp Tyr Asn Ile Glu Leu Lys Lys Lys Arg Glu Gly Ala 
210 215 220 

Val Arg Leu Leu Gln Glu Leu Gln Gln Lys Gly Ile Lys Ile Asp Gly 
225 230 235 240 

Val Gly Ile Gln Gly His Trp His Leu His Ser Pro Asp Leu Gln Glu 
245 250 255 

Ile Asp Ser Ser Leu Gln Ala Tyr Gly Gln Leu Gly Leu Lys Val Met 
260 265 270 

Ile Thr Glu Leu Asp Val Asn Val Ile Pro Glu Pro Ser Gly Ile Ile 
275 280 285 

Gly Ala Asp Val Ala Gln Arg Ala Asp Tyr Gln Ser Gln Leu Asn Pro 
290 295 300 

Trp Pro Glu Ser Phe Pro Asp Ser Met Gln Gln Val Leu Ala Ser Arg 
305 310 315 320 

Tyr Ala Glu Leu Phe Gly Leu Phe Leu Lys His Ser Asp Lys Val Ser 
325 330 335 

Arg Val Thr Phe Trp Gly Ile His Asp Gly Tyr Ser Trp Lys Asn Asn 
340 345 350 

Trp Pro Ile Pro Gly Arg Thr Thr Tyr Pro Leu Leu Phe Asp Arg Asn 
355 360 365 

Tyr Gln Pro Lys Pro Ala Tyr Asp Ala Val Ile Glu Leu Thr Lys Ile 
370 375 380 

Gln Pro Glu Ala Ser Asn 
385 390 

<210> 29 
<211> 2331 
<212> DNA 
<213> Archaea 

<400> 29 

atgacgatgc agagaaagta ctcatccgac gcgaacacac agtatgagtg gataaaatca 60 

gctactgtac catctggtca gtgggtacag ctctctggaa cgtacacgat cccggccgga 120 

gttaccgtgg aagatctcac gctttacttc gaatctcaaa atccaaccct tgagttctac 180 

gtggatgacg tgaagatagt ggatacaact tccgcagaga taaagattga aatggaacct 240 

gaaaaagaga tacctgctct gaaagaagta ctgaaagatt acttcaaagt cggagttgca 300 

ctgccgtcca aggtcttcct caacccgaag gacatagaac tcatcacgaa acacttcaac 360 

agcatcaccg cagaaaacga gatgaaaccg gatagtctgc tcgcgggcat cgaaaacggt 420 

aagctgaagt tcaggtttga aacagcagac aaatacattc agttcgtcga ggaaaacggc 480 

atggttataa gaggtcacac actggtgtgg cacaaccaga cacccgactg gttcttcaaa 540 

gacgaaaacg gaaacctcct ctccaaagaa gcgatgacgg aaagactcaa agagtacatc 600 

cacaccgttg tcggacactt caaaggaaaa gtctacgcat gggacgtggt gaacgaagcg 660 

gtcgatccga accagccgga tggactgaga agatcaacct ggtaccagat catggggcct 720 

gactacatag aactcgcctt caagttcgca agagaggcag atccagatgc aaaactcttc 780 

tacaacgact acaacacatt cgatcccaga aagagagaca tcatctacaa cctcgtgaag 840 

gatctcaaag agaagggact catcgatggc ataggaatgc agtgtcacat cagtcttgca 900 

acagacatca aacagatcga agaggccatc aaaaagttca gcaccatacc cggtatagaa 960 

attcacatca cagaactcga tatgagtgtc tacagagatt ccagttccaa ctacccagag 1020 

gcaccgagga cggcactcat cgaacaggct cacaaaatga tgcagctctt tgagatcttc 1080 

aagaagcaca gcaacgtgat cacgaacgtc acattctggg gtctcaagga cgattactcc 1140 

tggagagcaa caagaagaaa cgactggccg ctcatcttcg acaaagatca ccaggcgaaa 1200 

ctcgcttact gggcgatagt ggcacctgag gtccttccac cacttccaaa agaaagcagg 1260 

atctccgaag gcgaagcagt ggtagtgggg atgatggacg actcgtacct gatgtcgaag 1320 

ccgatagaga tccttgacga agaagggaac gtgaaggcaa cgatcagggc agtgtggaaa 1380 

gacagcacga tctacatcta cggagaggta caggacaaga caaagaaacc agcagaagac 1440 

ggagtggcca tattcatcaa cccgaacaac gaaagaacac cctatctgca gcctgatgac 1500 

acctacgttg tgctgtggac gaactggaag acggaggtca acagagaaga cgtacaggtg 1560 

aagaaattcg ttgggcctgg ctttagaaga tacagcttcg agatgtcgat cacgataccg 1620 

ggtgtggagt tcaagaaaga cagctacata ggatttgacg ttgcggtgat agacgacggg 1680 

aagtggtaca gctggagcga cacgacgaac agccagaaga cgaacacgat gaactacgga 1740 

acgctgaagc tcgaaggaat aatggtagcg acagcaaaat acggaacacc ggtcatcgat 1800 

ggagagatcg atgagatctg gaacacgaca gaggagatag agacgaaagc ggtggctatg 1860 

ggatcgcttg acaagaatgc gacagcgaaa gtgagggtgc tgtgggacga gaactacctg 1920 

tacgtacttg cgatcgtgaa agagcccgtt ctgaacaaag acaacagcaa cccgtgggag 1980 

caggattccg tggagatctt cgtggatgag aacaaccaca agacaggata ctacgaagac 2040 

gacgacgcgc agttcagggt gaactacatg aacgagcaga cctttggaac gggaggaagt 2100 

ccagcgaggt tcaagacagc ggtgaagctg atcgaaggag gatacatagt tgaggcagcg 2160 

atcaagtgga agacgatcaa gccaacaccg aacacagtga taggattcaa catccaggtg 2220 

aacgatgcga acgagaaagg gcagagggtc ggtatcatct cctggagcga tcccacaaac 2280 

aacagctggc aagatccttc aaagttcggt aacctcagac tcatcaagtg a 2331 



<210> 30 
<211> 776 
<212> PRT 
<213> Archaea 

<400> 30 

Met Thr Met Gln Arg Lys Tyr Ser Ser Asp Ala Asn Thr Gln Tyr Glu 
1 5 10 15 

Trp Ile Lys Ser Ala Thr Val Pro Ser Gly Gln Trp Val Gln Leu Ser 
20 25 30 

Gly Thr Tyr Thr Ile Pro Ala Gly Val Thr Val Glu Asp Leu Thr Leu 
35 40 45 

Tyr Phe Glu Ser Gln Asn Pro Thr Leu Glu Phe Tyr Val Asp Asp Val 
50 55 60 

Lys Ile Val Asp Thr Thr Ser Ala Glu Ile Lys Ile Glu Met Glu Pro 
65 70 75 80 

Glu Lys Glu Ile Pro Ala Leu Lys Glu Val Leu Lys Asp Tyr Phe Lys 
85 90 95 

Val Gly Val Ala Leu Pro Ser Lys Val Phe Leu Asn Pro Lys Asp Ile 
100 105 110 

Glu Leu Ile Thr Lys His Phe Asn Ser Ile Thr Ala Glu Asn Glu Met 
115 120 125 

Lys Pro Asp Ser Leu Leu Ala Gly Ile Glu Asn Gly Lys Leu Lys Phe 
130 135 140 

Arg Phe Glu Thr Ala Asp Lys Tyr Ile Gln Phe Val Glu Glu Asn Gly 
145 150 155 160 

Met Val Ile Arg Gly His Thr Leu Val Trp His Asn Gln Thr Pro Asp 
165 170 175 

Trp Phe Phe Lys Asp Glu Asn Gly Asn Leu Leu Ser Lys Glu Ala Met 
180 185 190 

Thr Glu Arg Leu Lys Glu Tyr Ile His Thr Val Val Gly His Phe Lys 
195 200 205 

Gly Lys Val Tyr Ala Trp Asp Val Val Asn Glu Ala Val Asp Pro Asn 
210 215 220 

Gln Pro Asp Gly Leu Arg Arg Ser Thr Trp Tyr Gln Ile Met Gly Pro 
225 230 235 240 

Asp Tyr Ile Glu Leu Ala Phe Lys Phe Ala Arg Glu Ala Asp Pro Asp 
245 250 255 

Ala Lys Leu Phe Tyr Asn Asp Tyr Asn Thr Phe Asp Pro Arg Lys Arg 
260 265 270 

Asp Ile Ile Tyr Asn Leu Val Lys Asp Leu Lys Glu Lys Gly Leu Ile 
275 280 285 

Asp Gly Ile Gly Met Gln Cys His Ile Ser Leu Ala Thr Asp Ile Lys 
290 295 300 

Gln Ile Glu Glu Ala Ile Lys Lys Phe Ser Thr Ile Pro Gly Ile Glu 
305 310 315 320 

Ile His Ile Thr Glu Leu Asp Met Ser Val Tyr Arg Asp Ser Ser Ser 
325 330 335 

Asn Tyr Pro Glu Ala Pro Arg Thr Ala Leu Ile Glu Gln Ala His Lys 
340 345 350 

Met Met Gln Leu Phe Glu Ile Phe Lys Lys His Ser Asn Val Ile Thr 
355 360 365 

Asn Val Thr Phe Trp Gly Leu Lys Asp Asp Tyr Ser Trp Arg Ala Thr 
370 375 380 

Arg Arg Asn Asp Trp Pro Leu Ile Phe Asp Lys Asp His Gln Ala Lys 
385 390 395 400 

Leu Ala Tyr Trp Ala Ile Val Ala Pro Glu Val Leu Pro Pro Leu Pro 
405 410 415 

Lys Glu Ser Arg Ile Ser Glu Gly Glu Ala Val Val Val Gly Met Met 
420 425 430 

Asp Asp Ser Tyr Leu Met Ser Lys Pro Ile Glu Ile Leu Asp Glu Glu 
435 440 445 

Gly Asn Val Lys Ala Thr Ile Arg Ala Val Trp Lys Asp Ser Thr Ile 
450 455 460 

Tyr Ile Tyr Gly Glu Val Gln Asp Lys Thr Lys Lys Pro Ala Glu Asp 
465 470 475 480 

Gly Val Ala Ile Phe Ile Asn Pro Asn Asn Glu Arg Thr Pro Tyr Leu 
485 490 495 

Gln Pro Asp Asp Thr Tyr Val Val Leu Trp Thr Asn Trp Lys Thr Glu 
500 505 510 

Val Asn Arg Glu Asp Val Gln Val Lys Lys Phe Val Gly Pro Gly Phe 
515 520 525 

Arg Arg Tyr Ser Phe Glu Met Ser Ile Thr Ile Pro Gly Val Glu Phe 
530 535 540 

Lys Lys Asp Ser Tyr Ile Gly Phe Asp Val Ala Val Ile Asp Asp Gly 
545 550 555 560 

Lys Trp Tyr Ser Trp Ser Asp Thr Thr Asn Ser Gln Lys Thr Asn Thr 
565 570 575 

Met Asn Tyr Gly Thr Leu Lys Leu Glu Gly Ile Met Val Ala Thr Ala 
580 585 590 

Lys Tyr Gly Thr Pro Val Ile Asp Gly Glu Ile Asp Glu Ile Trp Asn 
595 600 605 

Thr Thr Glu Glu Ile Glu Thr Lys Ala Val Ala Met Gly Ser Leu Asp 
610 615 620 

Lys Asn Ala Thr Ala Lys Val Arg Val Leu Trp Asp Glu Asn Tyr Leu 
625 630 635 640 

Tyr Val Leu Ala Ile Val Lys Glu Pro Val Leu Asn Lys Asp Asn Ser 
645 650 655 

Asn Pro Trp Glu Gln Asp Ser Val Glu Ile Phe Val Asp Glu Asn Asn 
660 665 670 

His Lys Thr Gly Tyr Tyr Glu Asp Asp Asp Ala Gln Phe Arg Val Asn 
675 680 685 

Tyr Met Asn Glu Gln Thr Phe Gly Thr Gly Gly Ser Pro Ala Arg Phe 
690 695 700 

Lys Thr Ala Val Lys Leu Ile Glu Gly Gly Tyr Ile Val Glu Ala Ala 
705 710 715 720 

Ile Lys Trp Lys Thr Ile Lys Pro Thr Pro Asn Thr Val Ile Gly Phe 
725 730 735 

Asn Ile Gln Val Asn Asp Ala Asn Glu Lys Gly Gln Arg Val Gly Ile 
740 745 750 

Ile Ser Trp Ser Asp Pro Thr Asn Asn Ser Trp Gln Asp Pro Ser Lys 
755 760 765 

Phe Gly Asn Leu Arg Leu Ile Lys 
770 775 

<210> 31 

<211> 1134 

<212> DNA 

<213> unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 31 

gtggaaaccg tcggagcacc ggagctgagc tatgaaatcc ggaatttccg ggtggtggca 60 

ccggacggag tgccggatat acagcccaca gccgcaccgg aagcgcaggc tgttccggaa 120 

ggggagatgc cttccctgaa ggatgtatac gcgggcaaat tcgacttcgg tacggcgctg 180 

ccccggaatg cattcaatga tatccagctg ctgagactgg tgaaggacca gttcaacatc 240 

ctgacaccgg aaaatgagat gaaaccggat gcaatcctgg atgtgtacgg cagcaaaaaa 300 

ctggcggaaa aggacgagac agcggtggct gtccggtttg aagcatgcaa gacgctgctt 360 

cggttcgcac agtccaacgg cctgaaggtg cacggccata cgctgctgtg gcacaaccag 420 

accccggaag cccttttcca cgaaggttat gacaccacca agccgatggc cggccgggaa 480 

gtgatgttgg gccggatgga gaattacatc cgcgaagtgc tgacctggac cgaagaaaat 540 

tatccgggcg tgatcgtttc ctgggacgtg gtgaatgaag caatcgacga cggaacgaac 600 

cagctgcgca ccggtgccaa ctggtataag acggtcggac cggactacct ggcacgcgcg 660 

tttgaatatg cccggaaata cgcggcggaa ggcgtgctgc tgtactacaa cgattacaat 720 

accgcatacg gcggtaagct gtatgggatt gtggatctgc tggagagcct gattgccgag 780 

ggcaatattg acggatacgg attccagatg caccacagcc tgggagaacc ttccatggat 840 

atgattaccc gggcagtaga gaaaatagcc tcgctgggac tccggctgcg tgtgagcgaa 900 

ctggacatca acgccggcaa ggcgacagag aaaaatttcg aagcccagaa gaacaagtac 960 

aaacaggtga tgaagctgat gctccggttc aaggaccaga ctgaagcggt ccaggtgtgg 1020 

ggcgtgacgg acatcatgag ctggcgcagg gacggatatc cgctgctgtt tgacaagaac 1080 

atgaatccga aacccgcgtt cttcggtgtg atcgaagccg gaatggaaga ctga 1134 

<210> 32 
<211> 377 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 32 

Val Glu Thr Val Gly Ala Pro Glu Leu Ser Tyr Glu Ile Arg Asn Phe 
1 5 10 15 

Arg Val Val Ala Pro Asp Gly Val Pro Asp Ile Gln Pro Thr Ala Ala 
20 25 30 

Pro Glu Ala Gln Ala Val Pro Glu Gly Glu Met Pro Ser Leu Lys Asp 
35 40 45 

Val Tyr Ala Gly Lys Phe Asp Phe Gly Thr Ala Leu Pro Arg Asn Ala 
50 55 60 

Phe Asn Asp Ile Gln Leu Leu Arg Leu Val Lys Asp Gln Phe Asn Ile 
65 70 75 80 

Leu Thr Pro Glu Asn Glu Met Lys Pro Asp Ala Ile Leu Asp Val Tyr 
85 90 95 

Gly Ser Lys Lys Leu Ala Glu Lys Asp Glu Thr Ala Val Ala Val Arg 
100 105 110 

Phe Glu Ala Cys Lys Thr Leu Leu Arg Phe Ala Gln Ser Asn Gly Leu 
115 120 125 

Lys Val His Gly His Thr Leu Leu Trp His Asn Gln Thr Pro Glu Ala 
130 135 140 

Leu Phe His Glu Gly Tyr Asp Thr Thr Lys Pro Met Ala Gly Arg Glu 
145 150 155 160 

Val Met Leu Gly Arg Met Glu Asn Tyr Ile Arg Glu Val Leu Thr Trp 
165 170 175 

Thr Glu Glu Asn Tyr Pro Gly Val Ile Val Ser Trp Asp Val Val Asn 
180 185 190 

Glu Ala Ile Asp Asp Gly Thr Asn Gln Leu Arg Thr Gly Ala Asn Trp 
195 200 205 

Tyr Lys Thr Val Gly Pro Asp Tyr Leu Ala Arg Ala Phe Glu Tyr Ala 
210 215 220 

Arg Lys Tyr Ala Ala Glu Gly Val Leu Leu Tyr Tyr Asn Asp Tyr Asn 
225 230 235 240 

Thr Ala Tyr Gly Gly Lys Leu Tyr Gly Ile Val Asp Leu Leu Glu Ser 
245 250 255 

Leu Ile Ala Glu Gly Asn Ile Asp Gly Tyr Gly Phe Gln Met His His 
260 265 270 

Ser Leu Gly Glu Pro Ser Met Asp Met Ile Thr Arg Ala Val Glu Lys 
275 280 285 

Ile Ala Ser Leu Gly Leu Arg Leu Arg Val Ser Glu Leu Asp Ile Asn 
290 295 300 

Ala Gly Lys Ala Thr Glu Lys Asn Phe Glu Ala Gln Lys Asn Lys Tyr 
305 310 315 320 

Lys Gln Val Met Lys Leu Met Leu Arg Phe Lys Asp Gln Thr Glu Ala 
325 330 335 

Val Gln Val Trp Gly Val Thr Asp Ile Met Ser Trp Arg Arg Asp Gly 
340 345 350 

Tyr Pro Leu Leu Phe Asp Lys Asn Met Asn Pro Lys Pro Ala Phe Phe 
355 360 365 

Gly Val Ile Glu Ala Gly Met Glu Asp 
370 375 

<210> 33 

<211> 1815 

<212> DNA 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 33 

atggttcgca aaaaactatt ttatatcgtc gcgttaatgc tgatgttcgg cgcaagtttt 60 
acttccgctc aggacgcgga attttccctg cgcggtttag ccgagcgcaa taacttttat 120 
gttggagcag ccgtttatac cactcatctg aatgatcctg tccatgttga aacactggca 180 
cgagaattca atatgctcac gcctgaacag caggccaaac attgtgagtt ggaggcacag 240 
caaggtcaat ttgactttcg gagtttcgat cgtttagtcg ccttcgccga agaacacaac 300 
atggcgatac acggtcatgc gctggtctgg catagctgca caccgcaatg ggtggctaac 360 
ggcgagtaca cccgtgacga agccattggt ctgctgcgcg actcgattat gaccattgtt 420 
ggccgttaca aaggccgtat tccgatttgg gacgtcgtca atgaaggcat tgctgacagc 480 
ggcggaacac tgcgcgatac gccatggcgg cagttaattg gcgatgatta catcgaactt 540 
gccttccagt tcgctcatga agccgacccg gatgcgctgc tgttttacaa cgactataat 600 
acggaaggca tgaaccctaa atcggacgcc atgtacgaga tggtgagcga ttttgtggcg 660 
cgtggaattc cgattcacgg ggttgggctg caatcccatt tcatattagg cagttttgac 720 
ccagaccaga ttgctcggaa cgtcgcgcgg cttggcgaac tcggtttaca agttcaattc 780 
accgaggtcg atattcgata ttccggcgag gcgacagata atatcctcca gcggcaggcg 840 
ggcgattacc atcgcctgat ggacgtttgc ctcggtaacg acgcctgtac tgcgtttatc 900 
acctggggcg tgaccgataa atatacctgg ttgcggggcg cgaacctggg cttctacaac 960 
aacctatcgg ttgaaccgct gctttttgac gatgactatg aacccaagcc cgcttatttt 1020 
gcggtgctgg actcactagc gcgaagagcg ggcgaaaccc ccgttttgag cgatgacgag 1080 
cttgcggcga tgatcggcgg cacagtccaa acggtcgaaa ttcccccgcc gacgaaaagc 1140 
aatctcagtc aggaagcgcc ggacgccgtt cctggtgtga tctattacgc cgcctacccc 1200 
ataagcatca cagttgacgg cgaagccaac gattgggaac gcattccgcg cggtatgatt 1260 
gacagcggcc ccaccgtacc tcaggataac gacacgacaa tgacatttgc cgccgctgcc 1320 
gacaaaacca atctatactt ccttgcagag gttacggaca gccaggtgtc ctacggaacg 1380 
cacgacccgg ctactgcctg gtatcaggag gactcggttg agttttacct gaacacgaca 1440 
ggcgatctaa ctaacaccgc ctaccaaccc ggcgtcgccc aaatcggtat catggcagcc 1500 
aacatcgaca acgataatcc cggtgcaccg atcatcgggg gcggcaacag cgacatttcg 1560 
caggtaaaag cgattgtcgt caaaaccgat accgggtatc tggtcgaggc gtctgttcca 1620 
ctcatgaccg atgtctggac gattgaaccg aaacaagggg ctgtactcgg cttccaagtg 1680 
catctcaatg gctcacgcac accggatgcc gaccgagaca ccaagttgat ctggtcgcta 1740 
ctggatacgc tagatcagtc ctatagcaat cccagcctgt ttggccgact catcttctgg 1800 
aacataaatc tctaa 1815 



<210> 34 
<211> 604 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from 

<221> SIGNAL 
<222> (1)...(23) 

<400> 34 

Met Val Arg Lys Lys 

Gly Ala Ser Phe Thr 
20 

Leu Ala Glu Arg Asn 
35 

His Leu Asn Asp Pro 
50 

Met Leu Thr Pro Glu 
65 

Gin Gly Gin Phe Asp 
85 

Glu Glu His Asn Met 
100 

Cys Thr Pro Gin Trp 
115 

lie Gly Leu Leu Arg 
130 

Gly Arg lie Pro lie 
145 

Gly Gly Thr Leu Arg 
165 

Tyr lie Glu Leu Ala 
180 

Leu Leu Phe Tyr Asn 
195 

Asp Ala Met Tyr Glu 
210 



an environmental sample 



Leu Phe Tyr lie 
ser Ala 
Asn Phe 



val His 
55 

Gin Gin 
70 

Phe Arg 



Gin Asp 

25 

Tyr val 
val Glu 



Ala Lys 
Ser Phe 



Ala lie His Gl 
Val Ala 



Asp Ser 
135 
Trp Asp 
150 

Asp Thr 

Phe Gin 

Asp Tyr 

Met val 
215 



Asn Gly 
120 

He Met 



Val Val 

Pro Trp 

Phe Ala 
185 
Asn Thr 
200 

Ser Asp 



Val Ala Leu Met 
10 

Ala Glu Phe Ser 

Gly Ala Ala val 
45 

Thr Leu Ala Arg 
60 

His cys Glu Leu 
75 

Asp Arg Leu val 
90 

His Ala Leu Val 

Glu Tyr Thr Arg 
125 

Thr lie Val Gly 
140 

Asn Glu Gly lie 
155 

Arg Gin Leu lie 
170 

His Glu Ala Asp 

Glu Gly Met Asn 
205 

Phe val Ala Arg 
220 
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Leu Met Phe 
15 

Leu Arg Gly 
30 

Tyr Thr Thr 

Glu Phe Asn 

Glu Ala Gin 
80 

Ala Phe Ala 
95 

Trp His ser 
110 

Asp Glu Ala 

Arg Tyr Lys 

Ala Asp Ser 
160 

Gly Asp Asp 

175 
Pro Asp Ala 
190 

Pro Lys Ser 
Gly He Pro 



240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1815 
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lie His Gly Val Gly Leu Gin ser His Phe lie Leu Gly Ser Phe Asp 
225 230 235 240 

pro Asp Gin lie Ala Arg Asn Val Ala Arg Leu Gly Glu Leu Gly Leu 

245 250 255 

Gin Val Gin Phe Thr Glu val Asp lie Arg Tyr ser Gly Glu Ala Thr 

260 265 " 270 

Asp Asn lie Leu Gin Arg Gin Ala Gly Asp Tyr His Arg Leu Met Asp 

275 280 285 

Val cys Leu Gly Asn Asp Ala cys Thr Ala Phe lie Thr Trp Gly Val 

290 295 300 

Thr Asp Lys Tyr Thr Trp Leu Arg Gly Ala Asn Leu Gly Phe Tyr Asn 
305 310 315 320 

Asn Leu Ser Val Glu Pro Leu Leu Phe Asp Asp Asp Tyr Glu Pro Lys 

325 330 335 

Pro Ala Tyr Phe Ala val Leu Asp Ser Leu Ala Arg Arg Ala Gly Glu 

340 345 350 

Thr Pro Val Leu Ser Asp Asp Glu Leu Ala Ala Met lie Gly Gly Thr 

355 360 365 

val Gin Thr val Glu lie Pro Pro Pro Thr Lys Ser Asn Leu Ser Gin 

370 375 380 

Glu Ala Pro Asp Ala val Pro Gly val lie Tyr Tyr Ala Ala Tyr Pro 
385 390 395 400 

lie ser lie Thr val Asp Gly Glu Ala Asn Asp Trp Glu Arg lie Pro 

405 410 415 

Arg Gly Met lie Asp ser Gly Pro Thr Val Pro Gin Asp Asn Asp Thr 

420 425 430 

Thr Met Thr Phe Ala Ala Ala Ala Asp Lys Thr Asn Leu Tyr Phe Leu 

435 440 445 

Ala Glu val Thr Asp ser Gin Val ser Tyr Gly Thr His Asp Pro Ala 

450 455 460 

Thr Ala Trp Tyr Gin Glu Asp Ser Val Glu Phe Tyr Leu Asn Thr Thr 
465 470 475 480 

Gly Asp Leu Thr Asn Thr Ala Tyr Gin Pro Gly Val Ala Gin lie Gly 

485 490 495 

He Met Ala Ala Asn lie Asp Asn Asp Asn Pro Gly Ala Pro lie He 

500 505 510 

Gly Gly Gly Asn ser Asp lie Ser Gin val Lys Ala lie Val Val Lys 

515 520 525 

Thr Asp Thr Gly Tyr Leu Val Glu Ala Ser Val Pro Leu Met Thr Asp 

530 535 540 

val Trp Thr lie Glu Pro Lys Gin Gly Ala Val Leu Gly Phe Gin Val 
545 550 555 560 

His Leu Asn Gly Ser Arg Thr Pro Asp Ala Asp Arg Asp Thr Lys Leu 

565 570 575 

lie Trp ser Leu Leu Asp Thr Leu Asp Gin Ser Tyr Ser Asn Pro ser 

580 585 590 

Leu Phe Gly Arg Leu lie Phe Trp Asn lie Asn Leu 
595 ~ 600 

<210> 35 
<211> 2286 
<212> DNA 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
<400> 35 

atgaccttga ttacgccaag ctcgaaatta accctcacta aagggaacaa aagctggagc 60
tcgcgcgcct gcaggtcgac actagtggat ctcacgcttt acttcgaatc tcaaaatcca 120
acccttgagt tctacgtgga tgacgtgaag atagtggata caacttccgc agagataaag 180
attgaaatgg aacctgaaaa agagatacct gctctgaaag aagtactgaa agattacttc 240
aaagtcggag ttgcactgcc gtccaaggtc ttcctcaacc cgaaggacat agaactcatc 300
acgaaacact tcaacagcat caccgcagaa aacgagatga aaccggatag tctgctcgcg 360
ggcatcgaaa acggtaagct gaagttcagg tttgaaacag cagacaaata cattcagttc 420
gtcgaggaaa acggcatggt tataagaggt cacacactgg tgtggcacaa ccagacaccc 480
gactggttct tcaaagacga aaacggaaac ctcctctcca aagaagcgat gacggaaaga 540
ctcaaagagt acatccacac cgttgtcgga cacttcaaag gaaaagtcta cgcatgggac 600
gtggtgaacg aagcggtcga tccgaaccag ccggatggac tgagaagatc aacctggtac 660
cagatcatgg ggcctgacta catagaactc gccttcaagt tcgcaagaga ggcagatcca 720
gatgcaaaac tcttctacaa cgactacaac acattcgatc ccagaaagag agacatcatc 780
tacaacctcg tgaaggatct caaagagaag ggactcatcg atggcatagg aatgcagtgt 840
cacatcagtc ttgcaacaga catcaaacag atcgaagagg ccatcaaaaa gttcagcacc 900
atacccggta tagaaattca catcacagaa ctcgatatga gtgtctacag agattccagt 960
tccaactacc cagaggcacc gaggacggca ctcatcgaac aggctcacaa aatgatgcag 1020
ctctttgaga tcttcaagaa gcacagcaac gtgatcacga acgtcacatt ctggggtctc 1080
aaggacgatt actcctggag agcaacaaga agaaacgact ggccgctcat cttcgacaaa 1140
gatcaccagg cgaaactcgc ttactgggcg atagtggcac ctgaggtcct tccaccactt 1200
ccaaaagaaa gcaggatctc cgaaggcgaa gcagtggtag tggggatgat ggacgactcg 1260
tacctgatgt cgaagccgat agagatcctt gacgaagaag ggaacgtgaa ggcaacgatc 1320
agggcagtgt ggaaagacag cacgatctac atctacggag aggtacagga caagacaaag 1380
aaaccagcag aagacggagt ggccatattc atcaacccga acaacgaaag aacaccctat 1440
ctgcagcctg atgacaccta cgttgtgctg tggacgaact ggaagacgga ggtcaacaga 1500
gaagacgtac aggtgaagaa attcgttggg cctggcttta gaagatacag cttcgagatg 1560
tcgatcacga taccgggtgt ggagttcaag aaagacagct acataggatt tgacgttgcg 1620
gtgatagacg acgggaagtg gtacagctgg agcgacacga cgaacagcca gaagacgaac 1680
acgatgaact acggaacgct gaagctcgaa ggaataatgg tagcgacagc aaaatacgga 1740
acaccggtca tcgatggaga gatcgatgag atctggaaca cgacagagga gatagagacg 1800
aaagcggtgg ctatgggatc gcttgacaag aatgcgacag cgaaagtgag ggtgctgtgg 1860
gacgagaact acctgtacgt acttgcgatc gtgaaagatc ccgttctgaa caaagacaac 1920
agcaacccgt gggagcagga ttccgtggag atcttcgtgg atgagaacaa ccacaagaca 1980
ggatactacg aagacgacga cgcgcagttc agggtgaact acatgaacga gcagaccttt 2040
ggaacgggag gaagtccagc gaggttcaag acagcggtga agctgatcga aggaggatac 2100
atagttgagg cagcgatcaa gtggaagacg atcaagccaa caccgaacac agtgatagga 2160
ttcaacatcc aggtgaacga tgcgaacgag aaagggcaga gggtcggtat catctcctgg 2220
agcgatccca caaacaacag ctggcaagat ccttcaaagt tcggtaacct cagactcatc 2280
aagtga 2286

<210> 36
<211> 761
<212> PRT
<213> Unknown

<220>

<223> Obtained from an environmental sample
<400> 36

Met Thr Leu Ile Thr Pro Ser Ser Lys Leu Thr Leu Thr Lys Gly Asn
1 5 10 15

Lys Ser Trp Ser Ser Arg Ala Cys Arg Ser Thr Leu Val Asp Leu Thr
20 25 30

Leu Tyr Phe Glu Ser Gln Asn Pro Thr Leu Glu Phe Tyr Val Asp Asp
35 40 45

Val Lys Ile Val Asp Thr Thr Ser Ala Glu Ile Lys Ile Glu Met Glu
50 55 60

Pro Glu Lys Glu Ile Pro Ala Leu Lys Glu Val Leu Lys Asp Tyr Phe
65 70 75 80

Lys Val Gly Val Ala Leu Pro Ser Lys Val Phe Leu Asn Pro Lys Asp
85 90 95

Ile Glu Leu Ile Thr Lys His Phe Asn Ser Ile Thr Ala Glu Asn Glu
100 105 110

Met Lys Pro Asp Ser Leu Leu Ala Gly Ile Glu Asn Gly Lys Leu Lys
115 120 125

Phe Arg Phe Glu Thr Ala Asp Lys Tyr Ile Gln Phe Val Glu Glu Asn
130 135 140

Gly Met Val Ile Arg Gly His Thr Leu Val Trp His Asn Gln Thr Pro
145 150 155 160

Asp Trp Phe Phe Lys Asp Glu Asn Gly Asn Leu Leu Ser Lys Glu Ala
165 170 175

Met Thr Glu Arg Leu Lys Glu Tyr Ile His Thr Val Val Gly His Phe
180 185 190

Lys Gly Lys Val Tyr Ala Trp Asp Val Val Asn Glu Ala Val Asp Pro
195 200 205

Asn Gln Pro Asp Gly Leu Arg Arg Ser Thr Trp Tyr Gln Ile Met Gly
210 215 220

Pro Asp Tyr Ile Glu Leu Ala Phe Lys Phe Ala Arg Glu Ala Asp Pro
225 230 235 240

Asp Ala Lys Leu Phe Tyr Asn Asp Tyr Asn Thr Phe Asp Pro Arg Lys
245 250 255

Arg Asp Ile Ile Tyr Asn Leu Val Lys Asp Leu Lys Glu Lys Gly Leu
260 265 270

Ile Asp Gly Ile Gly Met Gln Cys His Ile Ser Leu Ala Thr Asp Ile
275 280 285

Lys Gln Ile Glu Glu Ala Ile Lys Lys Phe Ser Thr Ile Pro Gly Ile
290 295 300

Glu Ile His Ile Thr Glu Leu Asp Met Ser Val Tyr Arg Asp Ser Ser
305 310 315 320

Ser Asn Tyr Pro Glu Ala Pro Arg Thr Ala Leu Ile Glu Gln Ala His
325 330 335

Lys Met Met Gln Leu Phe Glu Ile Phe Lys Lys His Ser Asn Val Ile
340 345 350

Thr Asn Val Thr Phe Trp Gly Leu Lys Asp Asp Tyr Ser Trp Arg Ala
355 360 365

Thr Arg Arg Asn Asp Trp Pro Leu Ile Phe Asp Lys Asp His Gln Ala
370 375 380

Lys Leu Ala Tyr Trp Ala Ile Val Ala Pro Glu Val Leu Pro Pro Leu
385 390 395 400

Pro Lys Glu Ser Arg Ile Ser Glu Gly Glu Ala Val Val Val Gly Met
405 410 415

Met Asp Asp Ser Tyr Leu Met Ser Lys Pro Ile Glu Ile Leu Asp Glu
420 425 430

Glu Gly Asn Val Lys Ala Thr Ile Arg Ala Val Trp Lys Asp Ser Thr
435 440 445

Ile Tyr Ile Tyr Gly Glu Val Gln Asp Lys Thr Lys Lys Pro Ala Glu
450 455 460

Asp Gly Val Ala Ile Phe Ile Asn Pro Asn Asn Glu Arg Thr Pro Tyr
465 470 475 480

Leu Gln Pro Asp Asp Thr Tyr Val Val Leu Trp Thr Asn Trp Lys Thr
485 490 495

Glu Val Asn Arg Glu Asp Val Gln Val Lys Lys Phe Val Gly Pro Gly
500 505 510

Phe Arg Arg Tyr Ser Phe Glu Met Ser Ile Thr Ile Pro Gly Val Glu
515 520 525

Phe Lys Lys Asp Ser Tyr Ile Gly Phe Asp Val Ala Val Ile Asp Asp
530 535 540

Gly Lys Trp Tyr Ser Trp Ser Asp Thr Thr Asn Ser Gln Lys Thr Asn
545 550 555 560

Thr Met Asn Tyr Gly Thr Leu Lys Leu Glu Gly Ile Met Val Ala Thr
565 570 575

Ala Lys Tyr Gly Thr Pro Val Ile Asp Gly Glu Ile Asp Glu Ile Trp
580 585 590

Asn Thr Thr Glu Glu Ile Glu Thr Lys Ala Val Ala Met Gly Ser Leu
595 600 605

Asp Lys Asn Ala Thr Ala Lys Val Arg Val Leu Trp Asp Glu Asn Tyr
610 615 620

Leu Tyr Val Leu Ala Ile Val Lys Asp Pro Val Leu Asn Lys Asp Asn
625 630 635 640

Ser Asn Pro Trp Glu Gln Asp Ser Val Glu Ile Phe Val Asp Glu Asn
645 650 655

Asn His Lys Thr Gly Tyr Tyr Glu Asp Asp Asp Ala Gln Phe Arg Val
660 665 670

Asn Tyr Met Asn Glu Gln Thr Phe Gly Thr Gly Gly Ser Pro Ala Arg
675 680 685

Phe Lys Thr Ala Val Lys Leu Ile Glu Gly Gly Tyr Ile Val Glu Ala
690 695 700

Ala Ile Lys Trp Lys Thr Ile Lys Pro Thr Pro Asn Thr Val Ile Gly
705 710 715 720

Phe Asn Ile Gln Val Asn Asp Ala Asn Glu Lys Gly Gln Arg Val Gly
725 730 735

Ile Ile Ser Trp Ser Asp Pro Thr Asn Asn Ser Trp Gln Asp Pro Ser
740 745 750

Lys Phe Gly Asn Leu Arg Leu Ile Lys
755 760

<210> 37 
<211> 2769
<212> DNA
<213> unknown
<220> 

<223> Obtained from an environmental sample 
<400> 37 

atgcacaaga agaacggcac ctattacctg agctactcta caaatccggc caacggcatg 60 

cggattgact acatgaccag cacgagcccg accagtggct ttgtccatcg aggcacggtc 120 

atggcgcagc cctggcagaa cagcaacaac aacaaccacg caaccagcac cgagtacaac 180 

gggcagggct acatcttcta tcacaaccgt gcgttgtcga acgagcgtgc gggtggcaac 240 

gtgctgcagc gctcggtgaa cgtggatcgc ctctacttca atgccgatgg cagcatccgt 300 

caggtcactt ccagtgcaac cggcgtgccg gccctgaaaa ccctggatgc cttcctggtc 360 

aagcctgccg agctgtatca caaggaaagc gggatcaaga ccgagcctgc cagtgaagga 420 

acccaggcac tggttatgac ggctggtagc tgggtgcgcc tggccaatgt cgatttcggc 480 

aatggcggcg ccactggttt ttccgcgcgt attgcggcaa ccggcagcgg cagcatccag 540 

gtgatcctgg gcaatctgaa caacgccccg gtcggcacgc tggcagtgag cagcaccggc 600 

aacctccaga cctggcaaga ccgcagcacc gccatcagca aggtgaccgg cgtgcatgac 660 

gtgtatttgc gtgccaccgg caatgtgcat gtgcagcgtc actggttcgt ggcgtcggcg 720 

ccggccgctg ccgcctcatc cagcagtcag gcaagcgtct ctgccagcag tcaggcaagt 780 

gtttcttcca gtagccaggc aagcgttgcc tccagcagca gttccagccg cgcttcttcc 840 

gccagcagtt cggtggcggc tggccaggtg gaggtcggtt atcgccttag cagcgaatgg 900 

gccgccggct tctgcggcgt ggtcaccatc cgtaatccgg gtagttctcc ggtcaccagc 960 

tggagtggca gtttcaacct gcctggcggc aagatcaccc agctgtggaa tgccaactgg 1020 

acccagaacg gcagcaccgt gacggtatct tcccaggcct ggagcggtgc cattgctgca 1080 

ggcgccacca tcaccacgcc gggcttctgc gccgagcgca cgagcagcaa tgcgtcctcc 1140 

agtgtcgcca gcagcagtgt ctcctcatcg agcagcagtg ctgcggctgc cagctccagc 1200 

gcggcttcca gcgtcccgtc cactggcagt ggcggggtgg gcagcagcgc atcctcggct 1260 

agcagtgccg ctgcacccaa gggcgtgctt gaagtcggcc tgagcggcct ttccagccag 1320 

gccatgttcg ccccgttgcg ggtcaggacg gacgctgcgg ccgccaacaa ggcctatgtt 1380 

gaatggccca acaacggcgc caatcagtcg ctggcaacgc ctgccaacga tgccgcaggg 1440 

caggtggagg tagccttcgt gctggcccag gcatccgcag tgcagtttga tatcgaagcg 1500 

aatttcgcca acgcggaaga cgactccttc tacttccagc tcaacggtgg tgcctggcag 1560 

accttcaaca acgccaccac ggtcggctgg cagaccctgc cggtcgcctc tctgggcaat 1620 

ctggctgccg ggcgccatgt gctgaccctg ctgcgccgcg aggatggcgc gaagctgggc 1680 

aaggtcgtcc tgagtgcggc acagagcagc atcagtcgtg ccacgccggt ggcctacgcg 1740 

tcgccgaatg atgttgccaa cctgttcaag ctggccagct tcccgatcgg ggtggcggtc 1800 

agtgccggca acgaaggtga cagcctgctg cgtagcggta cccgcgcagc agccgagcgt 1860 

gcgctgaccg agaagcactt caacagtctg gtggccggca acatcatgaa gatgagctac 1920 

ctgcatccgg ccgagaacac ctacaccttc acccaggcgg atgcgctggc cgactacgcc 1980 

aagtccaagg gcatggtgtt gcatggccat gcgctggtct ggcatgcgga ctatcaggta 2040 

cccaactgga tgaagaatta caccggagac tggtcgaaga tgctcgaagc ccacgtcacc 2100 

accgtcgcca agcactatgc cggcaaggtg gtgagctggg atgtggtgaa tgaagccctg 2160 

gccgatggca atgccaccgc caccaagggt ttccgtgcca ccgattcgat cttctatcag 2220 

aagatgggct ccagtttcat cgagaaggcc tttattgctg cacgtgctgc cgacccgaat 2280 

gccgacctgt attacaacga ctacggcatg gagggcggaa acagcaagtt caattactgc 2340 

atggccatgg tcgatgattt ccagaagcgt ggcattccca tcgacggcat cggtttccag 2400 

atgcacatca acatcgactg gccttcgtcg gcccagatcc gcgctgtatt cagtgaagtg 2460 

gtcaagcgtg gtctgaaggt gcgtatctcc gagctggata ttccggtgaa taccactgcc 2520 

ggtcgttttg ccagcctgaa tgccacggcc aacgagctgc agaagaagaa gtatcgtgag 2580 

gttgtggctg cctacctgga tgtggtgccg cccgagctgc gcggtggcat caccgtgtgg 2640 

ggcctgagcg acaacggcag ctggctggtg acccccacca agccggactg gccgctgctg 2700 

ttcgatgccg acctcaaggc caaggacgcc ctgagcggct ttgccgacgc cctgcgcggc 2760 

gtacgctga 2769 

<210> 38 
<211> 922 
<212> PRT 
<213> unknown 

<220> 

<223> obtained from an environmental sample 
<400> 38 

Met His Lys Lys Asn Gly Thr Tyr Tyr Leu Ser Tyr Ser Thr Asn Pro
1 5 10 15

Ala Asn Gly Met Arg Ile Asp Tyr Met Thr Ser Thr Ser Pro Thr Ser
20 25 30

Gly Phe Val His Arg Gly Thr Val Met Ala Gln Pro Trp Gln Asn Ser
35 40 45

Asn Asn Asn Asn His Ala Thr Ser Thr Glu Tyr Asn Gly Gln Gly Tyr
50 55 60

Ile Phe Tyr His Asn Arg Ala Leu Ser Asn Glu Arg Ala Gly Gly Asn
65 70 75 80

Val Leu Gln Arg Ser Val Asn Val Asp Arg Leu Tyr Phe Asn Ala Asp
85 90 95

Gly Ser Ile Arg Gln Val Thr Ser Ser Ala Thr Gly Val Pro Ala Leu
100 105 110

Lys Thr Leu Asp Ala Phe Leu Val Lys Pro Ala Glu Leu Tyr His Lys
115 120 125

Glu Ser Gly Ile Lys Thr Glu Pro Ala Ser Glu Gly Thr Gln Ala Leu
130 135 140

Val Met Thr Ala Gly Ser Trp Val Arg Leu Ala Asn Val Asp Phe Gly
145 150 155 160

Asn Gly Gly Ala Thr Gly Phe Ser Ala Arg Ile Ala Ala Thr Gly Ser
165 170 175

Gly Ser Ile Gln Val Ile Leu Gly Asn Leu Asn Asn Ala Pro Val Gly
180 185 190

Thr Leu Ala Val Ser Ser Thr Gly Asn Leu Gln Thr Trp Gln Asp Arg
195 200 205

Ser Thr Ala Ile Ser Lys Val Thr Gly Val His Asp Val Tyr Leu Arg
210 215 220

Ala Thr Gly Asn Val His Val Gln Arg His Trp Phe Val Ala Ser Ala
225 230 235 240

Pro Ala Ala Ala Ala Ser Ser Ser Ser Gln Ala Ser Val Ser Ala Ser
245 250 255

Ser Gln Ala Ser Val Ser Ser Ser Ser Gln Ala Ser Val Ala Ser Ser
260 265 270

Ser Ser Ser Ser Arg Ala Ser Ser Ala Ser Ser Ser Val Ala Ala Gly
275 280 285

Gln Val Glu Val Gly Tyr Arg Leu Ser Ser Glu Trp Ala Ala Gly Phe
290 295 300

Cys Gly Val Val Thr Ile Arg Asn Pro Gly Ser Ser Pro Val Thr Ser
305 310 315 320

Trp Ser Gly Ser Phe Asn Leu Pro Gly Gly Lys Ile Thr Gln Leu Trp
325 330 335

Asn Ala Asn Trp Thr Gln Asn Gly Ser Thr Val Thr Val Ser Ser Gln
340 345 350

Ala Trp Ser Gly Ala Ile Ala Ala Gly Ala Thr Ile Thr Thr Pro Gly
355 360 365

Phe Cys Ala Glu Arg Thr Ser Ser Asn Ala Ser Ser Ser Val Ala Ser
370 375 380

Ser Ser Val Ser Ser Ser Ser Ser Ser Ala Ala Ala Ala Ser Ser Ser
385 390 395 400

Ala Ala Ser Ser Val Pro Ser Thr Gly Ser Gly Gly Val Gly Ser Ser
405 410 415

Ala Ser Ser Ala Ser Ser Ala Ala Ala Pro Lys Gly Val Leu Glu Val
420 425 430

Gly Leu Ser Gly Leu Ser Ser Gln Ala Met Phe Ala Pro Leu Arg Val
435 440 445

Arg Thr Asp Ala Ala Ala Ala Asn Lys Ala Tyr Val Glu Trp Pro Asn
450 455 460

Asn Gly Ala Asn Gln Ser Leu Ala Thr Pro Ala Asn Asp Ala Ala Gly
465 470 475 480

Gln Val Glu Val Ala Phe Val Leu Ala Gln Ala Ser Ala Val Gln Phe
485 490 495

Asp Ile Glu Ala Asn Phe Ala Asn Ala Glu Asp Asp Ser Phe Tyr Phe
500 505 510

Gln Leu Asn Gly Gly Ala Trp Gln Thr Phe Asn Asn Ala Thr Thr Val
515 520 525

Gly Trp Gln Thr Leu Pro Val Ala Ser Leu Gly Asn Leu Ala Ala Gly
530 535 540

Arg His Val Leu Thr Leu Leu Arg Arg Glu Asp Gly Ala Lys Leu Gly
545 550 555 560

Lys Val Val Leu Ser Ala Ala Gln Ser Ser Ile Ser Arg Ala Thr Pro
565 570 575

Val Ala Tyr Ala Ser Pro Asn Asp Val Ala Asn Leu Phe Lys Leu Ala
580 585 590

Ser Phe Pro Ile Gly Val Ala Val Ser Ala Gly Asn Glu Gly Asp Ser
595 600 605

Leu Leu Arg Ser Gly Thr Arg Ala Ala Ala Glu Arg Ala Leu Thr Glu
610 615 620

Lys His Phe Asn Ser Leu Val Ala Gly Asn Ile Met Lys Met Ser Tyr
625 630 635 640

Leu His Pro Ala Glu Asn Thr Tyr Thr Phe Thr Gln Ala Asp Ala Leu
645 650 655

Ala Asp Tyr Ala Lys Ser Lys Gly Met Val Leu His Gly His Ala Leu
660 665 670

Val Trp His Ala Asp Tyr Gln Val Pro Asn Trp Met Lys Asn Tyr Thr
675 680 685

Gly Asp Trp Ser Lys Met Leu Glu Ala His Val Thr Thr Val Ala Lys
690 695 700

His Tyr Ala Gly Lys Val Val Ser Trp Asp Val Val Asn Glu Ala Leu
705 710 715 720

Ala Asp Gly Asn Ala Thr Ala Thr Lys Gly Phe Arg Ala Thr Asp Ser
725 730 735

Ile Phe Tyr Gln Lys Met Gly Ser Ser Phe Ile Glu Lys Ala Phe Ile
740 745 750

Ala Ala Arg Ala Ala Asp Pro Asn Ala Asp Leu Tyr Tyr Asn Asp Tyr
755 760 765

Gly Met Glu Gly Gly Asn Ser Lys Phe Asn Tyr Cys Met Ala Met Val
770 775 780

Asp Asp Phe Gln Lys Arg Gly Ile Pro Ile Asp Gly Ile Gly Phe Gln
785 790 795 800

Met His Ile Asn Ile Asp Trp Pro Ser Ser Ala Gln Ile Arg Ala Val
805 810 815

Phe Ser Glu Val Val Lys Arg Gly Leu Lys Val Arg Ile Ser Glu Leu
820 825 830

Asp Ile Pro Val Asn Thr Thr Ala Gly Arg Phe Ala Ser Leu Asn Ala
835 840 845

Thr Ala Asn Glu Leu Gln Lys Lys Lys Tyr Arg Glu Val Val Ala Ala
850 855 860

Tyr Leu Asp Val Val Pro Pro Glu Leu Arg Gly Gly Ile Thr Val Trp
865 870 875 880

Gly Leu Ser Asp Asn Gly Ser Trp Leu Val Thr Pro Thr Lys Pro Asp
885 890 895

Trp Pro Leu Leu Phe Asp Ala Asp Leu Lys Ala Lys Asp Ala Leu Ser
900 905 910

Gly Phe Ala Asp Ala Leu Arg Gly Val Arg
915 920

<210> 39 
<211> 1143 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 39 

atgaaaaaaa cgattgcaca tttcacctta tggatagtgt tttttctctt cacttcctgt 60 

gctgttacgg cgcagaagaa tgcaaagaat acaagagtaa aacccactac cctaaaagag 120 

gcttaccaag gtaaattcta tatcggtact gcgatgaact tgagacagat tcacggagat 180 

gatccccaat ctgaaaatat tatcaaaaaa cagttcaatt ccatagttgc cgaaaactgc 240 

atgaagagta tgtatcttca gccggaggaa ggaaaatttt tcttcgatga tgcggacaag 300 

tttgtggatt ttggtcttca gaacaatatg ttcatcattg ggcattgtct gatttggcat 360 

tcgcaggcgc caaaatggtt tttcaccgat gagaatggaa acacggtttc tccagaagtt 420 

cttaaacaaa ggatgaaagc ccatattacc gccgtcgttt cccgttacaa agggaaaatc 480 

aaaggttggg atgtggtgaa cgaagccatt atggaagatg gttcttaccg taaaagcaaa 540 

ttttacgaga ttttgggaga agaatttatt ccgttggcat ttcagtatgc gcatgaagca 600 

gatcctgatg cagaacttta ttacaacgat tataacgaat ggtatcccgg aaaaagagct 660 

acggtgacca agataatccg cgatttcaaa actagaggaa tccgcatcga tgccatcgga 720 

atgcaggctc atttcgggat ggattcgccc actgtagaag agtatgaaca aactattcag 780 

ggctatataa aagaaggcgt gaaagtcaat attacggaac tcgatttgag tccacttcct 840 

tctccttggg gaacttccgc caatgttgcc gatacgcagc aatatcagga aaaaatgaat 900 

ccatacacca aaggacttcc tgcagatgtt gaaaaagcat gggaaaaccg ttatgtggat 960 

tttttcaaac tgttcctaaa atatcatcag catattgagc gtgttacgtt ttggggcgtt 1020 

agcgatatcg attcctggaa gaacgatttt ccggtaagag gacgtaccga ttatccacta 1080

ccgtttaacc gtcaatatca agcaaaacct ttggttcaga aattaataga tttaacaaaa 1140

tag 1143

<210> 40 
<211> 380 
<212> PRT 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(24)

<400> 40 

Met Lys Lys Thr Ile Ala His Phe Thr Leu Trp Ile Val Phe Phe Leu
1 5 10 15

Phe Thr Ser Cys Ala Val Thr Ala Gln Lys Asn Ala Lys Asn Thr Arg
20 25 30

Val Lys Pro Thr Thr Leu Lys Glu Ala Tyr Gln Gly Lys Phe Tyr Ile
35 40 45

Gly Thr Ala Met Asn Leu Arg Gln Ile His Gly Asp Asp Pro Gln Ser
50 55 60

Glu Asn Ile Ile Lys Lys Gln Phe Asn Ser Ile Val Ala Glu Asn Cys
65 70 75 80

Met Lys Ser Met Tyr Leu Gln Pro Glu Glu Gly Lys Phe Phe Phe Asp
85 90 95

Asp Ala Asp Lys Phe Val Asp Phe Gly Leu Gln Asn Asn Met Phe Ile
100 105 110

Ile Gly His Cys Leu Ile Trp His Ser Gln Ala Pro Lys Trp Phe Phe
115 120 125

Thr Asp Glu Asn Gly Asn Thr Val Ser Pro Glu Val Leu Lys Gln Arg
130 135 140

Met Lys Ala His Ile Thr Ala Val Val Ser Arg Tyr Lys Gly Lys Ile
145 150 155 160

Lys Gly Trp Asp Val Val Asn Glu Ala Ile Met Glu Asp Gly Ser Tyr
165 170 175

Arg Lys Ser Lys Phe Tyr Glu Ile Leu Gly Glu Glu Phe Ile Pro Leu
180 185 190

Ala Phe Gln Tyr Ala His Glu Ala Asp Pro Asp Ala Glu Leu Tyr Tyr
195 200 205

Asn Asp Tyr Asn Glu Trp Tyr Pro Gly Lys Arg Ala Thr Val Thr Lys
210 215 220

Ile Ile Arg Asp Phe Lys Thr Arg Gly Ile Arg Ile Asp Ala Ile Gly
225 230 235 240

Met Gln Ala His Phe Gly Met Asp Ser Pro Thr Val Glu Glu Tyr Glu
245 250 255

Gln Thr Ile Gln Gly Tyr Ile Lys Glu Gly Val Lys Val Asn Ile Thr
260 265 270

Glu Leu Asp Leu Ser Pro Leu Pro Ser Pro Trp Gly Thr Ser Ala Asn
275 280 285

Val Ala Asp Thr Gln Gln Tyr Gln Glu Lys Met Asn Pro Tyr Thr Lys
290 295 300

Gly Leu Pro Ala Asp Val Glu Lys Ala Trp Glu Asn Arg Tyr Val Asp
305 310 315 320

Phe Phe Lys Leu Phe Leu Lys Tyr His Gln His Ile Glu Arg Val Thr
325 330 335

Phe Trp Gly Val Ser Asp Ile Asp Ser Trp Lys Asn Asp Phe Pro Val
340 345 350

Arg Gly Arg Thr Asp Tyr Pro Leu Pro Phe Asn Arg Gln Tyr Gln Ala
355 360 365

Lys Pro Leu Val Gln Lys Leu Ile Asp Leu Thr Lys
370 375 380

<210> 41 
<211> 1893 
<212> DNA
<213> Unknown
<220>

<223> Obtained from an environmental sample 



<400> 41 

atgatccatc aacaaaagcc caaccaagac atcggtaggc tattcaagcg cagctgcagc 60
tttgtcggta ttagcgcggc actggctgtt ttctcacaca ccgcaagtgc agcctgtact 120
tacaacattg ataaccaatg gggcagcggt tttgtcgcta gtattactgt aaagaatgac 180
actggtgcaa ccgtcaataa ctggagtgtg aattggcaat atgccaacaa tcgcatcacc 240
aatggttgga gtgcaaattt ctctggcagc aatccttaca ccgccaccaa tatgagctgg 300
aacggtagca ttgccgctgg ccagtcggtg acttttggtt tccagggcaa cactaacagc 360
aataccgttg agcgcccggt ggttaacggt tcactgtgcg gtactgcaac aacctcttca 420
gttcgctcca gcgtggctgc gacgtcttcc agtcgctcca gtgttgcgcc cagctcgatt 480
cctgcttcca gcactccgcg ttcaagcaca cctgccacct cttcttctgc ttccagcttc 540
tcagtaccgg ccaataattt tgcgcagaat ggcggcgtgg aatctggttt gaccaactgg 600
ggtacgactg cgggcaccgt gactcgctct actgccgata aacacagcgg tacagccagt 660
gccttaatta ccggccgcac tgctgcctgg aatggtttga cgtttaatgt gggcgcattg 720
accaacggca accagtacca agtcaacgtg tgggtgaaat tggctccagg tacgcccgac 780
agcgtactga ccttaaccgg taagcgtgta gacgatagcg atactactac ctacaacgaa 840
tacacacgcg tagcgactgt gactgcctct gccaatgagt ggcgtttgct ggaaggttac 900
tacacccaat ctggcagcac tgcattccag catttcatta tcgaagcaac ggatactact 960
gccagttatt acgcggatga tttcgccatc ggcggtcaag tcgtacaagt tccaagcagc 1020
agctcacgca gctcaagcag tgctccggcg gctagaaaat tcatcggcaa catcaccacc 1080
tcgggtgcag tgagatccga ctttactcgt tactggaacc aaattacacc agagaacgaa 1140
ggtaagtggg gttccgttga aggtactcgc aaccagtaca actgggcacc gctggatcgt 1200
atttatgctt acgctcgcca aaataatatt ccggtaaaag ctcacacgtt tgtgtggggt 1260
gcgcaatcac ccgcgtggct caataactta agcggaccgg aagtcgctgt tgaaattgaa 1320
caatggattc gcgattactg tactcgttac cctgacacgg cgatgattga cgtagtgaac 1380
gaagcggttc ctggccatca accggcaggt tatgcacaac gagcatttgg caataactgg 1440
atccaacgcg tgttccaatt ggctcgccaa tattgcccta actcgatcct gatcctgaat 1500
gattacaaca atatccgttg gcagcacaat gagtttattg cccttgcaaa agctcaaggc 1560
aattatattg atgcagtcgg cctgcaggcg catgaactga agggtatgac agcggcgcaa 1620
gtcaaaaccg caatcgacaa tatttggaac caagtgggca agcccatcta catttctgaa 1680
tacgacattg gcgataacaa tgaccaggtt caattgcaga atttccaggc gcatttccct 1740
gtattctggg accatccgca tgttaaaggc atcaccattt ggggttatgt caatggcaga 1800
acttggattg aaggctcggg cctgatttct gacaacggaa caccgcgccc cgcaatgact 1860
tggttgctga ataactatat caataagcag taa 1893

<210> 42
<211> 630
<212> PRT
<213> unknown

<220>

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(37)

<400> 42

Met Ile His Gln Gln Lys Pro Asn Gln Asp Ile Gly Arg Leu Phe Lys
1 5 10 15

Arg Ser Cys Ser Phe Val Gly Ile Ser Ala Ala Leu Ala Val Phe Ser
20 25 30

His Thr Ala Ser Ala Ala Cys Thr Tyr Asn Ile Asp Asn Gln Trp Gly
35 40 45

Ser Gly Phe Val Ala Ser Ile Thr Val Lys Asn Asp Thr Gly Ala Thr
50 55 60

Val Asn Asn Trp Ser Val Asn Trp Gln Tyr Ala Asn Asn Arg Ile Thr
65 70 75 80

Asn Gly Trp Ser Ala Asn Phe Ser Gly Ser Asn Pro Tyr Thr Ala Thr
85 90 95

Asn Met Ser Trp Asn Gly Ser Ile Ala Ala Gly Gln Ser Val Thr Phe
100 105 110

Gly Phe Gln Gly Asn Thr Asn Ser Asn Thr Val Glu Arg Pro Val Val
115 120 125

Asn Gly Ser Leu Cys Gly Thr Ala Thr Thr Ser Ser Val Arg Ser Ser
130 135 140

Val Ala Ala Thr Ser Ser Ser Arg Ser Ser Val Ala Pro Ser Ser Ile
145 150 155 160

Pro Ala Ser Ser Thr Pro Arg Ser Ser Thr Pro Ala Thr Ser Ser Ser
165 170 175

Ala Ser Ser Phe Ser Val Pro Ala Asn Asn Phe Ala Gln Asn Gly Gly
180 185 190

Val Glu Ser Gly Leu Thr Asn Trp Gly Thr Thr Ala Gly Thr Val Thr
195 200 205

Arg Ser Thr Ala Asp Lys His Ser Gly Thr Ala Ser Ala Leu Ile Thr
210 215 220

Gly Arg Thr Ala Ala Trp Asn Gly Leu Thr Phe Asn Val Gly Ala Leu
225 230 235 240

Thr Asn Gly Asn Gln Tyr Gln Val Asn Val Trp Val Lys Leu Ala Pro
245 250 255

Gly Thr Pro Asp Ser Val Leu Thr Leu Thr Gly Lys Arg Val Asp Asp
260 265 270

Ser Asp Thr Thr Thr Tyr Asn Glu Tyr Thr Arg Val Ala Thr Val Thr
275 280 285

Ala Ser Ala Asn Glu Trp Arg Leu Leu Glu Gly Tyr Tyr Thr Gln Ser
290 295 300

Gly Ser Thr Ala Phe Gln His Phe Ile Ile Glu Ala Thr Asp Thr Thr
305 310 315 320

Ala Ser Tyr Tyr Ala Asp Asp Phe Ala Ile Gly Gly Gln Val Val Gln
325 330 335

Val Pro Ser Ser Ser Ser Arg Ser Ser Ser Ser Ala Pro Ala Ala Arg
340 345 350

Lys Phe Ile Gly Asn Ile Thr Thr Ser Gly Ala Val Arg Ser Asp Phe
355 360 365

Thr Arg Tyr Trp Asn Gln Ile Thr Pro Glu Asn Glu Gly Lys Trp Gly
370 375 380

Ser Val Glu Gly Thr Arg Asn Gln Tyr Asn Trp Ala Pro Leu Asp Arg
385 390 395 400

Ile Tyr Ala Tyr Ala Arg Gln Asn Asn Ile Pro Val Lys Ala His Thr
405 410 415

Phe Val Trp Gly Ala Gln Ser Pro Ala Trp Leu Asn Asn Leu Ser Gly
420 425 430

Pro Glu Val Ala Val Glu Ile Glu Gln Trp Ile Arg Asp Tyr Cys Thr
435 440 445

Arg Tyr Pro Asp Thr Ala Met Ile Asp Val Val Asn Glu Ala Val Pro
450 455 460

Gly His Gln Pro Ala Gly Tyr Ala Gln Arg Ala Phe Gly Asn Asn Trp
465 470 475 480

Ile Gln Arg Val Phe Gln Leu Ala Arg Gln Tyr Cys Pro Asn Ser Ile
485 490 495

Leu Ile Leu Asn Asp Tyr Asn Asn Ile Arg Trp Gln His Asn Glu Phe
500 505 510

Ile Ala Leu Ala Lys Ala Gln Gly Asn Tyr Ile Asp Ala Val Gly Leu
515 520 525

Gln Ala His Glu Leu Lys Gly Met Thr Ala Ala Gln Val Lys Thr Ala
530 535 540

Ile Asp Asn Ile Trp Asn Gln Val Gly Lys Pro Ile Tyr Ile Ser Glu
545 550 555 560

Tyr Asp Ile Gly Asp Asn Asn Asp Gln Val Gln Leu Gln Asn Phe Gln
565 570 575

Ala His Phe Pro Val Phe Trp Asp His Pro His Val Lys Gly Ile Thr
580 585 590

Ile Trp Gly Tyr Val Asn Gly Arg Thr Trp Ile Glu Gly Ser Gly Leu
595 600 605

Ile Ser Asp Asn Gly Thr Pro Arg Pro Ala Met Thr Trp Leu Leu Asn
610 615 620

Asn Tyr Ile Asn Lys Gln
625 630

<210> 43 
<211> 1011 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
atgcaaacaa atattaaagg aaataacatt ccatcattac acgaagttta tcaagatcac 60

tttttgatag gtgcagcagt taatccaaaa acattagact cacagcagga tttattgaga 120 

aaacacttta acagtattac agctgaaaat gaaatgaaat ttgaagaatt gcaaccagaa 180 

cctggccatt tcacgtttgg tgtagcagat gaaatcgttt catttgcaaa agaaaatgga 240 

atgaaagtta gaggacatac attagtttgg cataatcaaa cgcctgattg gatgtttttg 300 

aatgaagatg gatctgtcac agatcgagaa acgcttctag aaagaatgaa attacacatt 360 

acaacagtta tgcagcatta caaaggtcaa gcttattgct gggatgttgt aaatgaggtg 420 

attgctgacg agggtacaga gttattccgt aaatctaaat ggactgaaat tattggtgat 480 

gattttgtag aaaaggcatt tgaatatgca catgaggctg atccagaagc tttactattc 540 

tacaatgact ataatgaatc ccatcccaat aagcgtgaga aaattttcac acttgtaaaa 600 

ggattagttg ataaggggat acctattcat ggaatcggtt tacaagcaca ttggaattta 660 

acaggacctt cttatgaaga tattagagca gcactcgaga aatatgctac attgggattg 720

gaaatacacc ttaccgaatt ggatgtttct gtttttaatt atgaagatcg aagaacagat 780 

ttaacagaac caactaaaga tatgcaagcg cttcaagcgg agcgttatac agaattattc 840 

aagatattga gagaatatag tcatgtaatc agttcgatta ctttttgggg agctgcagat 900 

gattatactt ggttagatga ttttcctgtc aaaggaagaa aaaactggcc atttgttttt 960 

gatgaaaacc aagagccaaa agagtcattt tggaatatta ttgactttta a 1011 

<210> 44 
<211> 336 
<212> PRT 
<213> unknown 

<220> 

<223> obtained from an environmental sample 

<400> 44

Met Gln Thr Asn Ile Lys Gly Asn Asn Ile Pro Ser Leu His Glu Val
1 5 10 15

Tyr Gln Asp His Phe Leu Ile Gly Ala Ala Val Asn Pro Lys Thr Leu
20 25 30

Asp Ser Gln Gln Asp Leu Leu Arg Lys His Phe Asn Ser Ile Thr Ala
35 40 45

Glu Asn Glu Met Lys Phe Glu Glu Leu Gln Pro Glu Pro Gly His Phe
50 55 60

Thr Phe Gly Val Ala Asp Glu Ile Val Ser Phe Ala Lys Glu Asn Gly
65 70 75 80

Met Lys Val Arg Gly His Thr Leu Val Trp His Asn Gln Thr Pro Asp
85 90 95

Trp Met Phe Leu Asn Glu Asp Gly Ser Val Thr Asp Arg Glu Thr Leu
100 105 110

Leu Glu Arg Met Lys Leu His Ile Thr Thr Val Met Gln His Tyr Lys
115 120 125

Gly Gln Ala Tyr Cys Trp Asp Val Val Asn Glu Val Ile Ala Asp Glu
130 135 140

Gly Thr Glu Leu Phe Arg Lys Ser Lys Trp Thr Glu Ile Ile Gly Asp
145 150 155 160

Asp Phe Val Glu Lys Ala Phe Glu Tyr Ala His Glu Ala Asp Pro Glu
165 170 175

Ala Leu Leu Phe Tyr Asn Asp Tyr Asn Glu Ser His Pro Asn Lys Arg
180 185 190

Glu Lys Ile Phe Thr Leu Val Lys Gly Leu Val Asp Lys Gly Ile Pro
195 200 205

Ile His Gly Ile Gly Leu Gln Ala His Trp Asn Leu Thr Gly Pro Ser
210 215 220

Tyr Glu Asp Ile Arg Ala Ala Leu Glu Lys Tyr Ala Thr Leu Gly Leu
225 230 235 240

Glu Ile His Leu Thr Glu Leu Asp Val Ser Val Phe Asn Tyr Glu Asp
245 250 255

Arg Arg Thr Asp Leu Thr Glu Pro Thr Lys Asp Met Gln Ala Leu Gln
260 265 270

Ala Glu Arg Tyr Thr Glu Leu Phe Lys Ile Leu Arg Glu Tyr Ser His
275 280 285

Val Ile Ser Ser Ile Thr Phe Trp Gly Ala Ala Asp Asp Tyr Thr Trp
290 295 300

Leu Asp Asp Phe Pro Val Lys Gly Arg Lys Asn Trp Pro Phe Val Phe
305 310 315 320

Asp Glu Asn Gln Glu Pro Lys Glu Ser Phe Trp Asn Ile Ile Asp Phe
325 330 335

<210> 45
<211> 1137 
<212> DNA 
<213> Unknown 

<223> Obtained from an environmental sample 

atgaagatat cacgccgaca attacttgct atgggtggtg ccgctgcaac cctggcctct 60 

gccaaattat tcgctgccga aaaagctgct gccgccaccg gattgaaaga tgcctataaa 120 

aacgatttcc tgatcggtgc tgcattaaat acccaaattg ttgatggcaa agaccccaaa 180

cttactgcac tgatcaccaa agaatttaat tcaattaccg cagagaattg ccagaagtgg 240

gaaaggttgc gcaatgaaaa agatggtagc tgggaatgga aagatagcga tgcctttgtg 300

aatttcgggg ttgcccataa catgcatatt gtcgggcata cgttgggctg gcatagccaa 360

attcccgaca gcgtctttaa aaacaaagat ggcagttata tttccaaaga ggcactggca 420

aaaaaacaac aagagcacat caccacctta gtggatcgtt acaaaggcaa aattgccgca 480

tgggatgtgg ttaacgaagc catgggcgat gacaacaaga tgcgcgcaag ccattggtac 540

aacattatgg gtgatgactt tctcgtcaac gcctttaagc tcgcgcatga gactgacccc 600

aaagcacatt tgatgtacaa cgattacaac aacgagcgcc cggaaaagcg cgcagcaacg 660

gttgatatgc tcaagcgcct gttaaaactc ggggcgccga tccacggttt gggaatgcag 720

gcacatattg gcctggatgc ggatatgaaa aactttgaag acagtattgt cgcctattca 780

gaattaggct tgcgtattca ccttaccgaa ctggatatag atgtgttgcc ctcggtgtgg 840

aatttgccag tcgctgaagt atctacccgt tttgaataca aaccggagcg agatccttac 900

atcaaaggcc tgccaaaaga gatcgacgaa aaactcgcga aggcttatga atcgctattt 960

aaaattttgc ttaagcataa agacaaagta gatcgtgtga ccttctgggg tgtgagtgat 1020

gatgccagct ggctaaatgg cttcccgatc ccgggccgca ccaattatcc actgttattt 1080

gaccgtaagc agcaacctaa agcagcgtac ttccgcttac tggatttaaa gcgttaa 1137

<210> 46 

<211> 378 

<212> PRT 

<213> Unknown 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(25)

<400> 46

Met Lys Ile Ser Arg Arg Gln Leu Leu Ala Met Gly Gly Ala Ala Ala
1 5 10 15

Thr Leu Ala Ser Ala Lys Leu Phe Ala Ala Glu Lys Ala Ala Ala Ala
20 25 30

Thr Gly Leu Lys Asp Ala Tyr Lys Asn Asp Phe Leu Ile Gly Ala Ala
35 40 45

Leu Asn Thr Gln Ile Val Asp Gly Lys Asp Pro Lys Leu Thr Ala Leu
50 55 60

Ile Thr Lys Glu Phe Asn Ser Ile Thr Ala Glu Asn Cys Gln Lys Trp
65 70 75 80

Glu Arg Leu Arg Asn Glu Lys Asp Gly Ser Trp Glu Trp Lys Asp Ser
85 90 95

Asp Ala Phe Val Asn Phe Gly Val Ala His Asn Met His Ile Val Gly
100 105 110

His Thr Leu Gly Trp His Ser Gln Ile Pro Asp Ser Val Phe Lys Asn
115 120 125

Lys Asp Gly Ser Tyr Ile Ser Lys Glu Ala Leu Ala Lys Lys Gln Gln
130 135 140

Glu His Ile Thr Thr Leu Val Asp Arg Tyr Lys Gly Lys Ile Ala Ala
145 150 155 160

Trp Asp Val Val Asn Glu Ala Met Gly Asp Asp Asn Lys Met Arg Ala
165 170 175

Ser His Trp Tyr Asn Ile Met Gly Asp Asp Phe Leu Val Asn Ala Phe
180 185 190

Lys Leu Ala His Glu Thr Asp Pro Lys Ala His Leu Met Tyr Asn Asp
195 200 205

Tyr Asn Asn Glu Arg Pro Glu Lys Arg Ala Ala Thr Val Asp Met Leu
210 215 220

Lys Arg Leu Leu Lys Leu Gly Ala Pro Ile His Gly Leu Gly Met Gln
225 230 235 240

Ala His Ile Gly Leu Asp Ala Asp Met Lys Asn Phe Glu Asp Ser Ile
245 250 255

Val Ala Tyr Ser Glu Leu Gly Leu Arg Ile His Leu Thr Glu Leu Asp
260 265 270

Ile Asp Val Leu Pro Ser Val Trp Asn Leu Pro Val Ala Glu Val Ser
275 280 285

Thr Arg Phe Glu Tyr Lys Pro Glu Arg Asp Pro Tyr Ile Lys Gly Leu
290 295 300

Pro Lys Glu Ile Asp Glu Lys Leu Ala Lys Ala Tyr Glu Ser Leu Phe
305 310 315 320

Lys Ile Leu Leu Lys His Lys Asp Lys Val Asp Arg Val Thr Phe Trp
325 330 335

Gly Val Ser Asp Asp Ala Ser Trp Leu Asn Gly Phe Pro Ile Pro Gly
340 345 350

Arg Thr Asn Tyr Pro Leu Leu Phe Asp Arg Lys Gln Gln Pro Lys Ala
355 360 365

Ala Tyr Phe Arg Leu Leu Asp Leu Lys Arg
370 375

<210> 47
<211> 1137
<212> DNA
<213> Unknown

<220>

<223> obtained from an environmental sample

<400> 47 

atgaaaagaa taaagattct gaattcgatt gtattagctt taatcctggc gatcatcctg 60

ccgggatgtt ccaatgcaca gaagagcgag ccggtgctga aagatgccct ttcgggaaaa 120

ttttacatcg gggctgctct caataccccc caaattacgg gccgggatac cttgtccatg 180

aaaatggtca ccagacattt taactccatc gtagctgaga actgcatgaa aagcggggag 240

atccagcgga ccgaagggga gtttgatttc agtcttgccg accagtttgt cgcgttcggc 300

gaaaaacaca acatgcacat tgtggggcat accctgatat ggcattcaca ggcgccgcgc 360

tggtttttca ccggtgcaga cggaaacgaa gtcagccggg aggtactgat tgagcgcatg 420

aagaaccata tttatacggt cgtggggcgt tacaaaggcc gtgtccacgg ctgggatgtg 480

gtcaacgaag ccattgaaga caacggctca tggcgcaaca gcaagtttta ccagatctta 540

ggtgacgagt ttgtggaact ggcctttaaa tttgccgcag aagccgaccc ggatgccgaa 600

ctttactata acgactactc catggcatta gaaggcagga gaaatggcgt tatcagaatg 660

gtgaagaacc ttcagtccaa gggactcaaa attgacggta tcggcatgca ggggcatctg 720

ctcatggact cgcccacgct ggaagcttat gaagaaagta tcctggccta ttccggactg 780

ggcgttaagg tgatgatcac ggaactcgat ttgtctgcgc tgccatggcc agcccgtcag 840

cagggagccg atattgccct gagggctgag tatgaggcac ggatgaatcc ttacaccgaa 900

ggtttaaccg attcagcttc cgtggcatgg aatcagcgga tgggcgattt cttctctctt 960

ttcctgaagc accaggacaa aatcagcagg gttacccttt ggggggtcac cgataaccaa 1020

tcctggaaaa ataactttcc gatgagagga aggacagact acccgttgct ttttgaccgg 1080

aattaccaac ccaaaccggt ggtggaaaga atcatcaaag aagcgaaagc aaaataa 1137



<210> 48 
<211> 378 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(26) 

<400> 48 

Met Lys Arg Ile Lys Ile Leu Asn Ser Ile Val Leu Ala Leu Ile Leu
1 5 10 15

Ala Ile Ile Leu Pro Gly Cys Ser Asn Ala Gln Lys Ser Glu Pro Val
20 25 30

Leu Lys Asp Ala Leu Ser Gly Lys Phe Tyr Ile Gly Ala Ala Leu Asn
35 40 45

Thr Pro Gln Ile Thr Gly Arg Asp Thr Leu Ser Met Lys Met Val Thr
50 55 60

Arg His Phe Asn Ser Ile Val Ala Glu Asn Cys Met Lys Ser Gly Glu
65 70 75 80

Ile Gln Arg Thr Glu Gly Glu Phe Asp Phe Ser Leu Ala Asp Gln Phe
85 90 95

Val Ala Phe Gly Glu Lys His Asn Met His Ile Val Gly His Thr Leu
100 105 110

Ile Trp His Ser Gln Ala Pro Arg Trp Phe Phe Thr Gly Ala Asp Gly
115 120 125

Asn Glu Val Ser Arg Glu Val Leu Ile Glu Arg Met Lys Asn His Ile
130 135 140

Tyr Thr Val Val Gly Arg Tyr Lys Gly Arg Val His Gly Trp Asp Val
145 150 155 160

Val Asn Glu Ala Ile Glu Asp Asn Gly Ser Trp Arg Asn Ser Lys Phe
165 170 175

Tyr Gln Ile Leu Gly Asp Glu Phe Val Glu Leu Ala Phe Lys Phe Ala
180 185 190

Ala Glu Ala Asp Pro Asp Ala Glu Leu Tyr Tyr Asn Asp Tyr Ser Met
195 200 205

Ala Leu Glu Gly Arg Arg Asn Gly Val Ile Arg Met Val Lys Asn Leu
210 215 220

Gln Ser Lys Gly Leu Lys Ile Asp Gly Ile Gly Met Gln Gly His Leu
225 230 235 240

Leu Met Asp Ser Pro Thr Leu Glu Ala Tyr Glu Glu Ser Ile Leu Ala
245 250 255

Tyr Ser Gly Leu Gly Val Lys Val Met Ile Thr Glu Leu Asp Leu Ser
260 265 270

Ala Leu Pro Trp Pro Ala Arg Gln Gln Gly Ala Asp Ile Ala Leu Arg
275 280 285

Ala Glu Tyr Glu Ala Arg Met Asn Pro Tyr Thr Glu Gly Leu Thr Asp
290 295 300

Ser Ala Ser Val Ala Trp Asn Gln Arg Met Gly Asp Phe Phe Ser Leu
305 310 315 320

Phe Leu Lys His Gln Asp Lys Ile Ser Arg Val Thr Leu Trp Gly Val
325 330 335

Thr Asp Asn Gln Ser Trp Lys Asn Asn Phe Pro Met Arg Gly Arg Thr
340 345 350

Asp Tyr Pro Leu Leu Phe Asp Arg Asn Tyr Gln Pro Lys Pro Val Val
355 360 365

Glu Arg Ile Ile Lys Glu Ala Lys Ala Lys
370 375



<210> 49 
<211> 996 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample 



atgaacagct ccctcccctc cctccgcgat gtattcgcga atgatttccg catcggggcg 60 

gcggtcaatc ctgtgacgat cgagatgcaa aaacagttgt tgatcgatca tgtcaacagt 120 

attacggcag agaaccatat gaagtttgag catcttcagc cggaagaagg gaaatttacc 180 

tttcaggaag cggatcggat tgtggatttt gcttgttcgc accgaatggc ggttcgaggg 240

cacacacttg tatggcacaa ccagactccg gattgggtgt ttcaagatgg tcaaggccat 300 

ttcgtcagtc gggatgtgtt gcttgagcgg atgaaatgtc acatttcaac tgttgtacgg 360 

cgatacaagg gaaaaatata ttgttgggat gtcatcaacg aagcggtagc cgacgaagga 420 

gacgaattgt tgaggccgtc gaagtggcga caaatcatcg gggacgattt tatggaacaa 480 

gcatttctct acgcttatga agctgaccca gatgcactgc ttttttacaa tgactataat 540 

gaatgttttc cggaaaagag agaaaaaatt tttgcacttg tcaaatcgct gcgtgataaa 600 

ggcattccga ttcatggcat cggcatgcag gcgcactgga gcctgacccg cccgtcgctt 660 

gatgaaattc gtgcggcgat tgaacggtat gcgtcccttg gtgttgttct tcatattacg 720

gaactcgatg tatccatgtt tgaatttcac gatcgtcgaa ccgatttggc tgtcccgacg 780 

aacgaaatga tcgaacagca agcagaacgg tatgggcaaa tttttgcttt gtttaaggag 840 

tatcgcgatg ttattcaaag tgtcacattt tggggaattg ctgatgacca tacatggctc 900 

gataactttc cagtgcacgg gagaaaaaac tggccgcttt tgttcgatga acagcataaa 960 
ccgaaaccag ctttttggcg ggcagtgagt gtctga 996

<210> 50 
<211> 331 
<212> PRT 
<213> Unknown 



<220> 

<223> Obtained from an environmental sample 



<400> 50

Met Asn Ser Ser Leu Pro Ser Leu Arg Asp Val Phe Ala Asn Asp Phe
1 5 10 15

Arg Ile Gly Ala Ala Val Asn Pro Val Thr Ile Glu Met Gln Lys Gln
20 25 30

Leu Leu Ile Asp His Val Asn Ser Ile Thr Ala Glu Asn His Met Lys
35 40 45

Phe Glu His Leu Gln Pro Glu Glu Gly Lys Phe Thr Phe Gln Glu Ala
50 55 60

Asp Arg Ile Val Asp Phe Ala Cys Ser His Arg Met Ala Val Arg Gly
65 70 75 80

His Thr Leu Val Trp His Asn Gln Thr Pro Asp Trp Val Phe Gln Asp
85 90 95

Gly Gln Gly His Phe Val Ser Arg Asp Val Leu Leu Glu Arg Met Lys
100 105 110

Cys His Ile Ser Thr Val Val Arg Arg Tyr Lys Gly Lys Ile Tyr Cys
115 120 125

Trp Asp Val Ile Asn Glu Ala Val Ala Asp Glu Gly Asp Glu Leu Leu
130 135 140

Arg Pro Ser Lys Trp Arg Gln Ile Ile Gly Asp Asp Phe Met Glu Gln
145 150 155 160

Ala Phe Leu Tyr Ala Tyr Glu Ala Asp Pro Asp Ala Leu Leu Phe Tyr
165 170 175

Asn Asp Tyr Asn Glu Cys Phe Pro Glu Lys Arg Glu Lys Ile Phe Ala
180 185 190

Leu Val Lys Ser Leu Arg Asp Lys Gly Ile Pro Ile His Gly Ile Gly
195 200 205

Met Gln Ala His Trp Ser Leu Thr Arg Pro Ser Leu Asp Glu Ile Arg
210 215 220

Ala Ala Ile Glu Arg Tyr Ala Ser Leu Gly Val Val Leu His Ile Thr
225 230 235 240

Glu Leu Asp Val Ser Met Phe Glu Phe His Asp Arg Arg Thr Asp Leu
245 250 255

Ala Val Pro Thr Asn Glu Met Ile Glu Gln Gln Ala Glu Arg Tyr Gly
260 265 270

Gln Ile Phe Ala Leu Phe Lys Glu Tyr Arg Asp Val Ile Gln Ser Val
275 280 285

Thr Phe Trp Gly Ile Ala Asp Asp His Thr Trp Leu Asp Asn Phe Pro
290 295 300

Val His Gly Arg Lys Asn Trp Pro Leu Leu Phe Asp Glu Gln His Lys
305 310 315 320

Pro Lys Pro Ala Phe Trp Arg Ala Val Ser Val
325 330

<210> 51 
<211> 3162 
<212> DNA
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
<400> 51 

atgagaggga aaagcaaaaa gggatttctg aacatctcag aagctgtact tgttggaatt 60 

ttagcaggct ttcttggagt tcttctcgca gctacggggg ttttgagttt tggtggaaca 120

gcgtcttcgt ctcttgaaac ggtgttcacc ttgagtttcg agggaacaac gcaaggtgtc 180 

aatccctttg gaaaagaagt agttctcaca gcttctcaag atgtagcagc cgatggcgaa 240 

tattcattga aagtagagaa tagaacttcc ggctgggatg gagttgagat cgatttaacg 300 

gaaaaagtag aagcgaacaa agattatctg ttgtctttct acgtctatca aacatctgac 360

tcaccccaac tttttgaagt ccttgcaaga acagaagacg ggaaaggtga aaaatacgaa 420

acccttaccg acaaggtggt agtatcgaac tactggaaag aaattcttgt gcccttttcc 480 

ccgagtttcg agagtacccc aacaaaatgt tctttgatcg ttgtttcacc aaagaaccca 540 

tcattcactt tctatattga caaggttcaa attctcaaac cgaagaagca aggtccacaa 600 

gtcatttacg aaacatcctt tgagagtggg acgggaagct ggcaagccag agggtctgat 660 

gtgaaaatca aagtgacatc gaaagttgct cattctggaa aaaggtctct ctatgtctcc 720 

aacagacaaa aaggctggca tggtgtacaa cttgacgtga agagactctt gagacccggg 780 

aaaacgtatg cttttgaagg atgggtttat caagactctg gacaggatca aacaattatt 840 

ctgacgatgc agagaagata ttcttctgat tctagcacac aatatgagtg gatcaaggcg 900 

gtaactgttc catcaggaca atggacgcag atctctggaa cttacacaat ccaaccaaga 960 

gtaagcgtgg aggaactcat tgtttacttt gaagccaagg atcccactct tgccttctat 1020 

gtggacgatt tcaaaataac ggataccaca actactgaca tcaagctcga gctgaagcct 1080 

gaagaagaaa ttccagctct taaagaagtg cttggagatt acttcaaagt aggtgttgcc 1140 

ttacctttca aagtttttgc caaaccagag gatattgctc tcattactaa acatttcaac 1200 

agcatcactg ccgaaaacga aatgaaacct gagagtctct tggctggcgt agaaaatgga 1260 

aagttgaagt tcaggtttga gacagcagac aaatacgtag aatttgcaca gcaaaacggt 1320 

atggttgtga gaggtcacac tctggtgtgg cacaatcaaa caccggactg gttcttcaag 1380 

gacgagaacg gaaatctgct ctccaaagaa gcaatgactg aaaggcttag ggaatacatc 1440 

cacacagtcg tcggacactt caaaggcaaa gtttacgcgt gggacgtcgt taatgaggca 1500 

gtagatccat cccaaccaga tggacttaga agatctatat ggtacgaaat catgggacct 1560 

gactatatag aacttgcatt caagtttgca agagaagcgg accccaatgc aaagctcttc 1620 

tacaacgact acaacaccta ccaggagaag aagagagaca tcatttacaa cctcgtcaaa 1680 

tccctcaaag agaagggact cattgacggt atcggtatgc agtgtcatat cggtgttggg 1740 

accagtgtca aagagattga agaggcaatc aaaaaattca gcaccattcc aggtatcgaa 1800 

attcatatca cggaactaga tataagtgtg tacgaggatg cgacttccaa ttatccaaca 1860 

cctccaaggg aggctctcat taaacaagca cacgtaatga gagaactctt tgccatcttc 1920 

aaaaagtaca gcaacgtcat aacaaacgtt actttctggg gattgaaaga tgattattcc 1980 

tggaagaatg cccgcagaaa cgactggccg ctactttttg ataaagacta ccaagccaaa 2040 

cttgcctact gggccatagt cagtcctgag gctctaccgg tgcttccaaa gaaatggtct 2100 

atcgctacag gtagtgcttt ggtagttgga atgatggatg actcctattt ggcttcttca 2160 

cctatcaaaa ttctcgtcga tggccaagaa aaactcacag ccagagtcat ctgggaagaa 2220 

aacaaactct tcgtctacgc agaggtctat gacaggacaa gagacaaagg aaaggacggt 2280 

atcaccatct ttgtggatcc taaaaacttc aaggcacctt acttgcatga agatgctttc 2340 

tacgttacca taaaaaccga ctggagtgtt gagaagagtc gtgatgacat agaagtccag 2400 

agattcgtag gtccaagtgg agtaaggtac aacgttgaat gtgaaataac acttcctgaa 2460 

aaactccagg aaggacagca aatcggattt gatatcgccg tccaggatgg cgataaggtc 2520 

tacagctggt ctgatacatc caatcagcag aagctcgcaa ccatgaacta cggaactctc 2580 

actctgcagg gtgcggtaat ggccacagct aagtatggca cacctgtgat cgatggtgaa 2640 

atagatgaca tctggtacac cactgaagaa atctcaaccg atgttgttgt catgggttca 2700 

ctcaagaacg caagggcaaa agtgagagtg ctctgggatg aagagcacct ctatgtgctt 2760 

gccatcgtaa ccgatcctgt gctcaataag gacaacacca atccatggga acaagactct 2820 

gtagaaatct tcatagacga aaacaacgcc aaaacaccgt actatcagga cgatgatgct 2880 

caatatcgtg tcaactacct caacgaacaa tccttcggta caggtgcaag cagcaagaac 2940 

ttcaagacag ccgtgaaact catcgatggt ggttatcttg ttgaggcagc ggttaaatgg 3000 

aagaccatca aaccttcacc aaacacagtg ataggctttg atttccaggt gaacgatgca 3060 

aatgctcaag gtaagagagt tggaatactt aagtggtgcg atccaacgga caacagctgg 3120 

cagaatacct ccaagtttgg taatctcagg ttgataaaat ag 3162 

<210> 52 
<211> 1053 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(30)

<400> 52 

Met Arg Gly Lys Ser Lys Lys Gly Phe Leu Asn Ile Ser Glu Ala Val
1 5 10 15

Leu Val Gly Ile Leu Ala Gly Phe Leu Gly Val Leu Leu Ala Ala Thr
20 25 30

Gly Val Leu Ser Phe Gly Gly Thr Ala Ser Ser Ser Leu Glu Thr Val
35 40 45

Phe Thr Leu Ser Phe Glu Gly Thr Thr Gln Gly Val Asn Pro Phe Gly
50 55 60

Lys Glu Val Val Leu Thr Ala Ser Gln Asp Val Ala Ala Asp Gly Glu
65 70 75 80

Tyr Ser Leu Lys Val Glu Asn Arg Thr Ser Gly Trp Asp Gly Val Glu
85 90 95

Ile Asp Leu Thr Glu Lys Val Glu Ala Asn Lys Asp Tyr Leu Leu Ser
100 105 110

Phe Tyr Val Tyr Gln Thr Ser Asp Ser Pro Gln Leu Phe Glu Val Leu
115 120 125

Ala Arg Thr Glu Asp Gly Lys Gly Glu Lys Tyr Glu Thr Leu Thr Asp
130 135 140

Lys Val Val Val Ser Asn Tyr Trp Lys Glu Ile Leu Val Pro Phe Ser
145 150 155 160

Pro Ser Phe Glu Ser Thr Pro Thr Lys Cys Ser Leu Ile Val Val Ser
165 170 175

Pro Lys Asn Pro Ser Phe Thr Phe Tyr Ile Asp Lys Val Gln Ile Leu
180 185 190

Lys Pro Lys Lys Gln Gly Pro Gln Val Ile Tyr Glu Thr Ser Phe Glu
195 200 205

Ser Gly Thr Gly Ser Trp Gln Ala Arg Gly Ser Asp Val Lys Ile Lys
210 215 220

Val Thr Ser Lys Val Ala His Ser Gly Lys Arg Ser Leu Tyr Val Ser
225 230 235 240

Asn Arg Gln Lys Gly Trp His Gly Val Gln Leu Asp Val Lys Arg Leu
245 250 255

Leu Arg Pro Gly Lys Thr Tyr Ala Phe Glu Gly Trp Val Tyr Gln Asp
260 265 270

Ser Gly Gln Asp Gln Thr Ile Ile Leu Thr Met Gln Arg Arg Tyr Ser
275 280 285

Ser Asp Ser Ser Thr Gln Tyr Glu Trp Ile Lys Ala Val Thr Val Pro
290 295 300

Ser Gly Gln Trp Thr Gln Ile Ser Gly Thr Tyr Thr Ile Gln Pro Arg
305 310 315 320

Val Ser Val Glu Glu Leu Ile Val Tyr Phe Glu Ala Lys Asp Pro Thr
325 330 335

Leu Ala Phe Tyr Val Asp Asp Phe Lys Ile Thr Asp Thr Thr Thr Thr
340 345 350

Asp Ile Lys Leu Glu Leu Lys Pro Glu Glu Glu Ile Pro Ala Leu Lys
355 360 365

Glu Val Leu Gly Asp Tyr Phe Lys Val Gly Val Ala Leu Pro Phe Lys
370 375 380

Val Phe Ala Lys Pro Glu Asp Ile Ala Leu Ile Thr Lys His Phe Asn
385 390 395 400

Ser Ile Thr Ala Glu Asn Glu Met Lys Pro Glu Ser Leu Leu Ala Gly
405 410 415

Val Glu Asn Gly Lys Leu Lys Phe Arg Phe Glu Thr Ala Asp Lys Tyr
420 425 430

Val Glu Phe Ala Gln Gln Asn Gly Met Val Val Arg Gly His Thr Leu
435 440 445

Val Trp His Asn Gln Thr Pro Asp Trp Phe Phe Lys Asp Glu Asn Gly
450 455 460

Asn Leu Leu Ser Lys Glu Ala Met Thr Glu Arg Leu Arg Glu Tyr Ile
465 470 475 480

His Thr Val Val Gly His Phe Lys Gly Lys Val Tyr Ala Trp Asp Val
485 490 495

Val Asn Glu Ala Val Asp Pro Ser Gln Pro Asp Gly Leu Arg Arg Ser
500 505 510

Ile Trp Tyr Glu Ile Met Gly Pro Asp Tyr Ile Glu Leu Ala Phe Lys
515 520 525

Phe Ala Arg Glu Ala Asp Pro Asn Ala Lys Leu Phe Tyr Asn Asp Tyr
530 535 540

Asn Thr Tyr Gln Glu Lys Lys Arg Asp Ile Ile Tyr Asn Leu Val Lys
545 550 555 560

Ser Leu Lys Glu Lys Gly Leu Ile Asp Gly Ile Gly Met Gln Cys His
565 570 575

Ile Gly Val Gly Thr Ser Val Lys Glu Ile Glu Glu Ala Ile Lys Lys
580 585 590

Phe Ser Thr Ile Pro Gly Ile Glu Ile His Ile Thr Glu Leu Asp Ile
595 600 605

Ser Val Tyr Glu Asp Ala Thr Ser Asn Tyr Pro Thr Pro Pro Arg Glu
610 615 620

Ala Leu Ile Lys Gln Ala His Val Met Arg Glu Leu Phe Ala Ile Phe
625 630 635 640

Lys Lys Tyr Ser Asn Val Ile Thr Asn Val Thr Phe Trp Gly Leu Lys
645 650 655

Asp Asp Tyr Ser Trp Lys Asn Ala Arg Arg Asn Asp Trp Pro Leu Leu
660 665 670

Phe Asp Lys Asp Tyr Gln Ala Lys Leu Ala Tyr Trp Ala Ile Val Ser
675 680 685

Pro Glu Ala Leu Pro Val Leu Pro Lys Lys Trp Ser Ile Ala Thr Gly
690 695 700

Ser Ala Leu Val Val Gly Met Met Asp Asp Ser Tyr Leu Ala Ser Ser
705 710 715 720

Pro Ile Lys Ile Leu Val Asp Gly Gln Glu Lys Leu Thr Ala Arg Val
725 730 735

Ile Trp Glu Glu Asn Lys Leu Phe Val Tyr Ala Glu Val Tyr Asp Arg
740 745 750

Thr Arg Asp Lys Gly Lys Asp Gly Ile Thr Ile Phe Val Asp Pro Lys
755 760 765

Asn Phe Lys Ala Pro Tyr Leu His Glu Asp Ala Phe Tyr Val Thr Ile
770 775 780

Lys Thr Asp Trp Ser Val Glu Lys Ser Arg Asp Asp Ile Glu Val Gln
785 790 795 800

Arg Phe Val Gly Pro Ser Gly Val Arg Tyr Asn Val Glu Cys Glu Ile
805 810 815

Thr Leu Pro Glu Lys Leu Gln Glu Gly Gln Gln Ile Gly Phe Asp Ile
820 825 830

Ala Val Gln Asp Gly Asp Lys Val Tyr Ser Trp Ser Asp Thr Ser Asn
835 840 845

Gln Gln Lys Leu Ala Thr Met Asn Tyr Gly Thr Leu Thr Leu Gln Gly
850 855 860

Ala Val Met Ala Thr Ala Lys Tyr Gly Thr Pro Val Ile Asp Gly Glu
865 870 875 880

Ile Asp Asp Ile Trp Tyr Thr Thr Glu Glu Ile Ser Thr Asp Val Val
885 890 895

Val Met Gly Ser Leu Lys Asn Ala Arg Ala Lys Val Arg Val Leu Trp
900 905 910

Asp Glu Glu His Leu Tyr Val Leu Ala Ile Val Thr Asp Pro Val Leu
915 920 925

Asn Lys Asp Asn Thr Asn Pro Trp Glu Gln Asp Ser Val Glu Ile Phe
930 935 940

Ile Asp Glu Asn Asn Ala Lys Thr Pro Tyr Tyr Gln Asp Asp Asp Ala
945 950 955 960

Gln Tyr Arg Val Asn Tyr Leu Asn Glu Gln Ser Phe Gly Thr Gly Ala
965 970 975

Ser Ser Lys Asn Phe Lys Thr Ala Val Lys Leu Ile Asp Gly Gly Tyr
980 985 990

Leu Val Glu Ala Ala Val Lys Trp Lys Thr Ile Lys Pro Ser Pro Asn
995 1000 1005

Thr Val Ile Gly Phe Asp Phe Gln Val Asn Asp Ala Asn Ala Gln Gly
1010 1015 1020

Lys Arg Val Gly Ile Leu Lys Trp Cys Asp Pro Thr Asp Asn Ser Trp
1025 1030 1035 1040

Gln Asn Thr Ser Lys Phe Gly Asn Leu Arg Leu Ile Lys
1045 1050

<210> 53 
<211> 2370 
<212> DNA 
<213> Bacteria 

atgaagggcc tgcaccggct ccgccgccgt cgccggacct gggtggcagg actgtcggcc 60
gcggcggtgg tcgccggcgc cctgacgctc ctccccggct ccgccggcgc cgcgggcctg 120
ggtacgcacg cggccccctc gggccggtac ttcggcacgg ccgtggccgc gggccgcctc 180
ggcgactcgg cgtacaccgc gatcgccgac cgggagttca acatgatcac cccggagaac 240
gagatgaagt gggacgccgt cgagccgtcc cgcggccgtt tcgacttcgg tcccgcggac 300
cggatcgtcg agcgtgccct ggcacgcggc cagcgcgtcc gcggccacac cacggtctgg 360
cactcgcagc tcccctcctg ggtgggctcc atccgcgaca cgaagacgct gcgcggcgtg 420
atgaaccacc acatcaccac ccagatgacc cactacaagg gcaagatcta cgcctgggac 480
gtggtcaacg aggccttcgc cgatggcggc agcggccggc tccgcgactc ggtcttccag 540

aaggtgctgg gcgacggctt catcgaggag gcgttccgca ccgcccgcgc ggccgacccc 600

tcggccaagc tctgctacaa cgactacaac atcgagaact ggtcggacgc caagacccag 660 

ggcgtctacc gcctggtgaa ggacttcacg tcccgaggcg ttcccatcga ctgcgtcggc 720 

ttccagagcc acttcggcgc gggcggcccg ccggcgagct tcaagacgac cctggccaac 780 

ttcgccgccc tgggcgtcga cgtccagatc accgagctgg acatcgccca ggcatcacct 840 

gcccactacg cgagtgcggt cagcacctgt ctgtccgtgg cccggtgcac cggcatcacg 900 

gtgtggggcg tccgtgacag cgactcctgg cggagcgccg aaagcccgct gctgttcgac 960 

cggaacggca agcccaagcc cgcgtacgcc gccgtcatga acgccctcgg ctccggctcg 1020

ggtcccaccc cgagcaagcc ggccgacggt acggggagcg gtacggggga gatcaagggc 1080

gtggcctccg gccgctgtct ggacgtcccc gcctccacca ccgccaacgg cacccgggcg 1140 

cagctgtggg actgtagcgg ccaggccaac cagcgctgga cccacaccgc cggcaagcag 1200 

ctgaagatcc acggcgacaa gtgcctggac gccaagggca agggcaccgc caacggcacc 1260 

gcggtggtcg tctgggactg caacggcggc accaaccagc agtggaacgt ccacaccgac 1320 

ggcacgatca ccggcgtcca gtccggtctg tgcctcgatg ccgtcggcgc ggccaccgcc 1380

aacggcaccc cgatccagct gcacgcctgt gggggtgtcg gcaaccagaa gtggtccgcc 1440 

ccgtccggat cgggcggcgg cacgtgcgtg cttccgtcga cgtacaagtg gagctcgacg 1500 

ggtgccctgg cgcagcccaa ggccgggtgg gcctcgctga aggacttcac ccatgtggtg 1560 

ctgggcggca agcacctggt ctacgggtcg aacttcaacg gatcgacgta cggctcgatg 1620 

acgttcagcc ccttcaccac ctggtcggac atggcgtccg caggacagaa ggcgatgaag 1680 

cagcccgcgg tcgcacccac cctgttctac ttcgcaccca agaagatctg ggtgctggcg 1740 

taccagtggg gcaggaccgc gttctcctac cggacgtcga ccgaccccac caacccgaac 1800 

ggctggtcgg cggagcagga gctcttctcc ggaagcatca ccggctcggg cacgggcccc 1860 

atcgaccaga cgctcatcgg cgacgggacg aacatgtacc tgttcttcgc cggtgacaac 1920 

ggcaagatct accgggccag catgccgatc gggaacttcc cgggcagctt cggctcctcg 1980 

tacacgacgg tcatgagcga caccgcgaag aacctgttcg aggcgccgca ggtgtacaag 2040 

gtcaaggacc agaaccagta cctcatgatc gtcgaggccc ggggcgcggg cgagcgccgc 2100 

tacttccgct cgttcacggc ctccagcctg agcggtgcgt ggaccccgca ggccgcgacc 2160 

gagagcaacc ccttcgcggg caaggccaac agcggcgcca cctggaccga cgacatcagc 2220 

cacggtgatc tgatccgcac caaccccgat cagaccatga ccatcgaccc ctgcaacctt 2280 

cagctgctct accagggcaa gtccccgcag gcgggcggac cctacgacca gctgccgtac 2340 
cggccgggcg tcctcaccct gcagcgctga 2370

<210> 54 
<211> 787 
<212> PRT 
<213> Bacteria 

<220> 

<221> SIGNAL 
<222> (D...C37) 

Met°Lys 4 Gly Leu His Arg Leu Arg Arg Arg Arg Arg Thr Trp Val Ala 

15 10 15 

Gly Leu ser Ala Ala Ala val val Ala Gly Ala Leu Thr Leu Leu Pro 

20 25 30 

Gly ser Ala Gly Ala Ala Gly Leu Gly Thr His Ala Ala Pro ser Gly 

35 40 45 

Arg Tyr Phe Gly Thr Ala Val Ala Ala Gly Arg Leu Gly Asp ser Ala 

50 " 55 60 

Tyr Thr Ala He Ala Asp Arg Glu Phe Asn Met lie Thr Pro Glu Asn 
65 70 75 80 

Glu Met Lys Trp Asp Ala val Glu Pro Ser Arg Gly Arg Phe Asp Phe 

85 90 95 

Gly Pro Ala Asp Arg He Val Glu Arxj Ala Leu Ala Arg Gl^ Gin Arg 

Val Arg Gly His Thr Thr val Trp His ser Gin Leu Pro Ser Trp Val 

115 120 125 

Glv ser He Arg Asp Thr Lys Thr Leu Arg Gly Val Met Asn His His 

130 135 140 

lie Thr Thr Gin Met Thr His Tyr Lys Gly Lys lie Tyr Ala Trp Asp 
145 150 155 160 

val val Asn Glu Ala Phe Ala Asp Gly Gly Ser Gly Arg Leu Arg Asp 

165 170 175 

Ser Val Phe Gin Lys Val Leu Gly Asp Gly Phe He Glu Glu Ala Phe 

180 185 190 

Arg Thr Ala Arg Ala Ala Asp Pro Ser Ala Lys Leu cys Tyr Asn Asp 
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Tyr Asn lie Glu Asn Trp ser Asp Ala Lys Thr Gin Gly Val Tyr Arg 
210 215 220 

val Lys Asp Phe Thr Ser Arg Gly val Pro lie 
225 230 235 

Phe Gin ser His Phe Gly Ala Gly Gly Pro Pro Ala Ser Phe Lys Thr 
■>/lk ?Rn 255 



210 n 
Leu val Lys Asp Phe Thr ser Arg Gly val Pro He Asp cys Val Gly 
225 230 235 240 

Phe Gin ser His Phe Gly Ala Gly Gly Pro Pro 

245 250 
Thr Leu Ala Asn Phe Ala Ala Leu Gly Val Asp val Gin lie Thr Glu 

260 265 270 

Leu Asp lie Ala Gin Ala Ser Pro Ala His Tyr Ala Ser Ala val Ser 

275 280 285 

Thr cys Leu ser val Ala Arg Cys Thr Gly He Thr Val Trp Gly Val 

290 295 300 

Arg Asp ser Asp Ser Trp Arg Ser Ala Glu ser Pro Leu Leu Phe Asp 
305 310 315 320 

Arg Asn Gly Lys Pro Lys Pro Ala Tyr Ala Ala val Met Asn Ala Leu 
325 330 335 



325 3jU 333 

Gly Ser Gly Ser Gly Pro Thr Pro ser Lys Pro Ala Asp Gly Thr Gly 

340 345 350 

Ser Gly Thr Gly Glu lie Lys Gly Val Ala Ser Gly Arg Cys Leu Asp 

355 360 , 365 

val Pro Ala Ser Thr Thr Ala Asn Gly Thr Arg Ala Gin Leu Trp Asp 

370 375 380 

cys Ser Gly Gin Ala Asn Gin Arg Trp Thr His Thr Ala Gly Lys Gin 
385 390 395 400 

Leu Lys lie His Gly Asp Lys Cys Leu Asp Ala Lys Gly Lys Gly Thr 
405 410 415 



420 HJU 
Gin Gin Trp Asn Val His Thr Asp Gly Thr He Thr Gly val Gin Ser 

435 440 445 

Glv Leu Cys Leu Asp Ala val Gly Ala Ala Thr Ala Asn Gly Thr Pro 

450 455 460 

He Gin Leu His Ala cys Gly Gly val Gly Asn Gin Lys Trp Ser Ala 
465 470 475 480 



465 4/u h/3 ™« 

Pro ser Gly Ser Gly Gly Gly Thr Cys Val Leu Pro Ser Thr Tyr Lys 

485 490 495 

Trp ser Ser Thr Gly Ala Leu Ala Gin Pro Lys Ala Gly Trp Ala ser 
500 505 510 



5UU 3U3 J - LV 

Leu Lys Asp Phe Thr His Val Val Leu Gly Gly Lys His Leu val Tyr 

515 520 525 

Gly ser Asn Phe Asn Gly ser Thr Tyr Gly ser Met Thr Phe Ser Pro 

530 535 540 

Phe Thr Thr Trp Ser Asp Met Ala Ser Ala Gly Gin Lys Ala Met Lys 
545 550 555 560 

Gin Pro Ala Val Ala Pro Thr Leu Phe Tyr Phe Ala Pro Lys Lys lie 

565 570 575 

Trp Val Leu Ala Tyr Gin Trp Gly Arg Thr Ala Phe Ser Tyr Arg Thr 

580 585 590 

Ser Thr Asp pro Thr Asn Pro Asn Gly Trp ser Ala Glu Gin Glu Leu 

595 600 605 

Phe ser Gly ser lie Thr Gly ser Gly Thr Gly Pro He Asp Gin Thr 

610 615 620 

Leu lie Gly Asp Gly Thr Asn Met Tyr Leu phe Phe Ala Gly Asp Asn 
625 630 635 640 

Gly Lys He Tyr Arg Ala ser Met Pro He Gly Asn Phe Pro Gly Ser 

645 650 655 

Phe Gly ser ser Tyr Thr Thr val Met Ser Asp Thr Ala Lys Asn Leu 

660 665 670 

phe Glu Ala Pro Gin Val Tyr Lys Val Lys Asp Gin Asn Gin Tyr Leu 

675 680 685 

Met lie Val Glu Ala Arg Gly Ala Gly Glu Arg Arg Tyr Phe Arg ser 

690 695 700 

Phe Thr Ala Ser Ser Leu Ser Gly Ala Trp Thr Pro Gin Ala Ala Thr 
705 710 715 720 

Glu ser Asn Pro Phe Ala Gly Lys Ala Asn ser Gly Ala Thr Trp Thr 

725 730 735 

Asp Asp lie ser His Gly Asp Leu lie Arg Thr Asn Pro Asp Gin Thr 

740 745 750 

Met Thr He Asp Pro Cys Asn Leu Gin Leu Leu Tyr Gin Gly Lys ser 
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755 760 765 

pro Gin Ala Gly Gly Pro Tyr Asp Gin Leu Pro Tyr Arg Pro Gly val 

770 775 780

Leu Thr Leu 
785 

<210> 55 
<211> 1143 
<212> DNA 
<213> Unknown 

<220>

<223> Obtained from an environmental sample
<400> 55

atgaaaaaaa cgattgcaca tttcacctta tggatagcgt tttttctctt cacttcctgt 60
gctgttacgg cgcagaagaa tactaagaat gcaagagtaa agcccactac tctaaaagag 120
gcttaccaag gtaaattcta tatcggtaca gcgatgaatc tgagacagat tcacggagat 180
aatccccagt ctgaaaatat tatcaaaaaa cagttcaatt ccattgttgc tgaaaactgc 240
atgaagagta tgtatcttca gccggaggaa ggaaaatttt tcttcgatga tgcggataag 300
tttgtggatt ttggtcttca gaacaatatg tttattatcg ggcattgtct gatttggcat 360
tcgcaggcgc caaaatggtt tttcaccgac gagaatggga aaacggtctc cccagaagtt 420
cttaaacaaa ggatgaaagc tcatatcacc gccgtcgttt ctcgctacaa agggaaaatc 480
aaaggatggg atgtggtgaa cgaagccatt atggaagatg gttcttaccg caaaagcaaa 540
ttttacgaga ttttgggaga agaatttatt ccgttggcat ttcagtatgc gcatgaagca 600
gatcctgatg cagaactcta ttacaacgat tataacgaat ggtatcccgg aaaaagagct 660
acggtgacca aaataatccg agatttcaaa tctagaggaa tccgcattga tgccattgga 720
atgcaagctc atttcgggat ggattcaccc actatagaag agtatgaaca aactattcag 780
ggctatataa aagaaggcgt gaaagtcaat attacggaac tcgatttgag tccacttcct 840
tccccttggg gaacttccgc caacgttgcc gatacgcagc agtatcagga aaaaatgaat 900
ccttacacca aaggacttcc cacagaggtg gaaaaagctt gggaaaaccg ttatctcgat 960
tttttcaaac tattcctaaa atatcatcag catatcgagc gtgttacgtt ttggggcgtt 1020
agcgatatcg attcctggaa gaacgatttt ccagtgagag gacgtaccga ttatccgtta 1080
ccctttgacc gacagtatca ggcaaaacct ttggttcaga aattaataga cttaacgaaa 1140
tag 1143

<210> 56 
<211> 380 
<212> PRT 
<213> Unknown 

<220>

<223> Obtained from an environmental sample

<221> SIGNAL 
<222> (1)...(24)

<400> 56

Met Lys Lys Thr Ile Ala His Phe Thr Leu Trp Ile Ala Phe Phe Leu
1 5 10 15 

Phe Thr Ser Cys Ala Val Thr Ala Gln Lys Asn Thr Lys Asn Ala Arg

20 25 30 

Val Lys Pro Thr Thr Leu Lys Glu Ala Tyr Gln Gly Lys Phe Tyr Ile

35 40 45 

Gly Thr Ala Met Asn Leu Arg Gln Ile His Gly Asp Asp Pro Gln Ser

50 55 60 

Glu Asn Ile Ile Lys Lys Gln Phe Asn Ser Ile Val Ala Glu Asn Cys
65 70 75 80 

Met Lys Ser Met Tyr Leu Gln Pro Glu Glu Gly Lys Phe Phe Phe Asp
85 90 95

Asp Ala Asp Lys Phe Val Asp Phe Gly Leu Gln Asn Asn Met Phe Ile

100 105 110

Ile Gly His Cys Leu Ile Trp His Ser Gln Ala Pro Lys Trp Phe Phe

115 120 125 

Thr Asp Glu Asn Gly Lys Thr Val Ser Pro Glu Val Leu Lys Gln Arg

130 135 140 

Met Lys Ala His Ile Thr Ala Val Val Ser Arg Tyr Lys Gly Lys Ile
145 150 155 160

Lys Gly Trp Asp Val Val Asn Glu Ala Ile Met Glu Asp Gly Ser Tyr
165 170 175

Arg Lys Ser Lys Phe Tyr Glu Ile Leu Gly Glu Glu Phe Ile Pro Leu

180 185 190

Ala Phe Gln Tyr Ala His Glu Ala Asp Pro Asp Ala Glu Leu Tyr Tyr

195 200 205 

Asn Asp Tyr Asn Glu Trp Tyr Pro Gly Lys Arg Ala Thr Val Thr Lys

210 215 220

Ile Ile Arg Asp Phe Lys Ser Arg Gly Ile Arg Ile Asp Ala Ile Gly
225 230 235 240 

Met Gln Ala His Phe Gly Met Asp Ser Pro Thr Ile Glu Glu Tyr Glu

245 250 255 

Gln Thr Ile Gln Gly Tyr Ile Lys Glu Gly Val Lys Val Asn Ile Thr

260 265 270 

Glu Leu Asp Leu Ser Pro Leu Pro Ser Pro Trp Gly Thr Ser Ala Asn

275 280 285 

Val Ala Asp Thr Gln Gln Tyr Gln Glu Lys Met Asn Pro Tyr Thr Lys

290 295 300 

Gly Leu Pro Thr Glu Val Glu Lys Ala Trp Glu Asn Arg Tyr Leu Asp
305 310 315 320 

Phe Phe Lys Leu Phe Leu Lys Tyr His Gln His Ile Glu Arg Val Thr

325 330 335 

Phe Trp Gly Val Ser Asp Ile Asp Ser Trp Lys Asn Asp Phe Pro Val

340 345 350 

Arg Gly Arg Thr Asp Tyr Pro Leu Pro Phe Asp Arg Gln Tyr Gln Ala

355 360 365

Lys Pro Leu Val Gln Lys Leu Ile Asp Leu Thr Lys
370 375 380

<210> 57 
<211> 1578 
<212> DNA 
<213> Unknown

<220>

<223> Obtained from an environmental sample
<400> 57

atgaaaagaa tgatcggttt gctgctggcc atttgcctgg tgatgacgct ggctggggcc 60 
tgggctgcct cggatacgct ggtctatgca tccagtttcg cagcgggcga tgacgactgg 120 
tttgcaaggg gcgcttcccg ggtttaccat accacggagg cgacgctgcg gacggaaggc 180
cggagcgaca actggaattc tccgggacgc tattttgaac tggtgccgga taatgaatat 240 
acgctgagcg tggaggtcta ccaggacgga gcggacagcg cgaacttcat gatttccctg 300 
gaaaaggttg cggatgggat caccggatgg gaaaacctgg tgcggggaac cgtgaaaaag 360 
ggtgaatgga cgacgctgtc cggaacctat acttttgcag actatgaaag ctatgtgctg 420 
tatgtggaga cctccgacgc gccgacgctg gactttgaga tccggaattt ccgggtggaa 480 
agccccaatg ggatcccgga gccgaaggct accgaggcgc cggcagtggt ttcggaagcc 540 
acggatattc cgagcctgaa ggacgcttac gcggattact tcgactttgg cgcggccgtg 600 
ccgcagtctg ctttcaccag cagagataat attcagctga tggagctgat gaaaaaccag 660 
ttcagcatcc tgacgcctga aaatgagctg aagccggaca gtgtattgga tgtaagcgcc 720
agcaagcagc tggccaaaga ggatgaaacc gcggtagtgg tgcggtttaa cggggcaaag 780 
tcattgctgc ggtttgccca gcaaaacggc atcaaggtgc acgggcatgt gctggtctgg 840 
cacagccaga cgccggaagc ctttttccat gaaggatatg atcccaagaa cccgctggtg 900 
agccgggaag tgatgctggg acggctggaa aactatatcc gggaagtgct gacccagacg 960 

gaagaactgt atccgggcgt gatcgtcagc tgggacgtgg tgaacgaagc gattgacgac 1020 

ggaaccaact ggatccggaa gggatcgggc tggtaccgga ccatcgggga agactatgtg 1080 

gagaaggctt ttgagtttgc ccggaagtat gccccggaag gcgtgctgct gtactacaac 1140 

gattacaaca cggcatacgc cggaaaactg aatgggatta tcaaactgat caaacccatg 1200 

atcgagcagg gaacgatcga cggatacggc ttccagatgc accatacgac cgggcagccc 1260 

agcaaccaga tgatcaccac ggcggtggag aagatcgcgg ccctgggaat caagctgcgg 1320 

gtcagcgaga tggacatcgg gattacaaag tatacagaga cgagcctgca ggcacaaaag 1380

gacaagtaca aggcgatgat ggaactgatg ctgcggttcg cggaccagac ggaagcagtg 1440 

caggtctggg ggattacgga tacgatgagc tggcggagct ccagctatcc gctgctgttt 1500 

gaccggagca ggaatccgaa gccggcgttc tatggcgtga ttgaagcggt tgaagactgg 1560 

acagggaaaa gtgaatag 1578

<210> 58 
<211> 525 
<212> PRT 
<213> Unknown 
<220>

<223> Obtained from an environmental sample

<221> SIGNAL 
<222> (1)...(22)

<400> 58 

Met Lys Arg Met Ile Gly Leu Leu Leu Ala Ile Cys Leu Val Met Thr
1 5 10 15

Leu Ala Gly Ala Trp Ala Ala Ser Asp Thr Leu Val Tyr Ala Ser Ser

20 25 30 

Phe Ala Ala Gly Asp Asp Asp Trp Phe Ala Arg Gly Ala Ser Arg Val 

35 40 45 

Tyr His Thr Thr Glu Ala Thr Leu Arg Thr Glu Gly Arg Ser Asp Asn 

50 55 60 

Trp Asn Ser Pro Gly Arg Tyr Phe Glu Leu Val Pro Asp Asn Glu Tyr
65 70 75 80 

Thr Leu Ser Val Glu Val Tyr Gln Asp Gly Ala Asp Ser Ala Asn Phe

85 90 95 

Met Ile Ser Leu Glu Lys Val Ala Asp Gly Ile Thr Gly Trp Glu Asn

100 105 110 

Leu Val Arg Gly Thr Val Lys Lys Gly Glu Trp Thr Thr Leu Ser Gly 

115 120 125 

Thr Tyr Thr Phe Ala Asp Tyr Glu Ser Tyr Val Leu Tyr Val Glu Thr

130 135 140 

Ser Asp Ala Pro Thr Leu Asp Phe Glu Ile Arg Asn Phe Arg Val Glu
145 150 155 160 

Ser Pro Asn Gly Ile Pro Glu Pro Lys Ala Thr Glu Ala Pro Ala Val

165 170 175 

Val Ser Glu Ala Thr Asp Ile Pro Ser Leu Lys Asp Ala Tyr Ala Asp

180 185 190 

Tyr Phe Asp Phe Gly Ala Ala Val Pro Gln Ser Ala Phe Thr Ser Arg

195 200 205 

Asp Asn Ile Gln Leu Met Glu Leu Met Lys Asn Gln Phe Ser Ile Leu

210 215 220 

Thr Pro Glu Asn Glu Leu Lys Pro Asp Ser Val Leu Asp Val Ser Ala
225 230 235 240 

Ser Lys Gln Leu Ala Lys Glu Asp Glu Thr Ala Val Val Val Arg Phe

245 250 255 

Asn Gly Ala Lys Ser Leu Leu Arg Phe Ala Gln Gln Asn Gly Ile Lys

260 265 270 

Val His Gly His Val Leu Val Trp His Ser Gln Thr Pro Glu Ala Phe

275 280 285 

Phe His Glu Gly Tyr Asp Pro Lys Asn Pro Leu Val Ser Arg Glu Val

290 295 300 

Met Leu Gly Arg Leu Glu Asn Tyr Ile Arg Glu Val Leu Thr Gln Thr
305 310 315 320 

Glu Glu Leu Tyr Pro Gly Val Ile Val Ser Trp Asp Val Val Asn Glu

325 330 335 

Ala Ile Asp Asp Gly Thr Asn Trp Ile Arg Lys Gly Ser Gly Trp Tyr

340 345 350 

Arg Thr Ile Gly Glu Asp Tyr Val Glu Lys Ala Phe Glu Phe Ala Arg

355 360 365 

Lys Tyr Ala Pro Glu Gly Val Leu Leu Tyr Tyr Asn Asp Tyr Asn Thr 

370 375 380 

Ala Tyr Ala Gly Lys Leu Asn Gly Ile Ile Lys Leu Ile Lys Pro Met
385 390 395 400 

Ile Glu Gln Gly Thr Ile Asp Gly Tyr Gly Phe Gln Met His His Thr

405 410 415 

Thr Gly Gln Pro Ser Asn Gln Met Ile Thr Thr Ala Val Glu Lys Ile

420 425 430 

Ala Ala Leu Gly Ile Lys Leu Arg Val Ser Glu Met Asp Ile Gly Ile

435 440 445 

Thr Lys Tyr Thr Glu Thr Ser Leu Gln Ala Gln Lys Asp Lys Tyr Lys

450 455 460 

Ala Met Met Glu Leu Met Leu Arg Phe Ala Asp Gln Thr Glu Ala Val
465 470 475 480

Gln Val Trp Gly Ile Thr Asp Thr Met Ser Trp Arg Ser Ser Ser Tyr
485 490 495

Pro Leu Leu Phe Asp Arg Ser Arg Asn Pro Lys Pro Ala Phe Tyr Gly

500 505 510 

Val Ile Glu Ala Val Glu Asp Trp Thr Gly Lys Ser Glu
515 520 525 

<210> 59 
<211> 1104 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 59 

atgcttgcca gtagtgccgg tttggtagca tcccaactca agctgtccgc gttagctgca 60 

gctaaaaatg ctggattaaa agatgtatat aaggatcgct ttctgattgg tgcagcaatt 120 

aatacctcga ttgcgagcgg ccagcaacct gatattacag aaattatcaa gcgtgatttt 180 

tcgtcgttaa cacctgaaaa tgcaatgaag tgggaatctg tcaggactgc tgatggcgga 240 

tggaaatggg cagatgccga tcaattcgtt acgtttgcaa cagaacacaa aatacacgct 300 

gttggccaca cccttgcctg gcatagccag attcccgatt ccgtattcaa aaatgaaaaa 360 

ggcgaataca taaaatccac cgagctatca aaaaaaatgg aagaacatat cactacgatt 420 

gtaggtagat ataaaggcaa actcgatgcc tgggatgtag ttaatgaggc tgttggtgat 480 

gataatcaaa tgcgcaaaag ccattattac aatattctcg gcgaagattt tattgataag 540 

gcatttcacc ttgcgcatga ggtcgatccc aaagcgcatt taatgtataa cgactacaac 600 

attgaaaaag atggcaagcg tgaagctacc cttgaaatgt taaagcgttt acaaaaacgc 660 

ggtgtaccga ttcatgggct cggcatccag ggacatattg ccgttgatgg ccccagcatt 720 

gcggatattg aaaaaagtat tttggcttat gcggatttgg gtttgcgtgt acatttcacc 780 

gagttggata ttgatgtatt gccgcaaatc tggaacttac cggttgcaga aatttctaca 840 

cgcttcgaat acaaacctga gcgagatcct ttcaaaaatg gtttatcaaa agaaatgaac 900 

gataaactca gtgcacgcta tgaagaatta ttcacattat ttattaaaca caaagataaa 960 

attgatcgta ttactttgtg gggtgtcagc gatgatgcaa cctggctaaa tgatttcccc 1020 

atcaaaggca gaaccagtta tccattattg tttgatcgca agcatcaacc aaaagatgct 1080 

tattataaca ttctggcgtt gtga 1104 

<210> 60 

<211> 367 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(21) 

<400> 60 

Met Leu Ala Ser Ser Ala Gly Leu Val Ala Ser Gln Leu Lys Leu Ser
1 5 10 15

Ala Leu Ala Ala Ala Lys Asn Ala Gly Leu Lys Asp Val Tyr Lys Asp

20 25 30

Arg Phe Leu Ile Gly Ala Ala Ile Asn Thr Ser Ile Ala Ser Gly Gln

35 40 45

Gln Pro Asp Ile Thr Glu Ile Ile Lys Arg Asp Phe Ser Ser Leu Thr

50 55 60

Pro Glu Asn Ala Met Lys Trp Glu Ser Val Arg Thr Ala Asp Gly Gly
65 70 75 80

Trp Lys Trp Ala Asp Ala Asp Gln Phe Val Thr Phe Ala Thr Glu His

85 90 95

Lys Ile His Ala Val Gly His Thr Leu Ala Trp His Ser Gln Ile Pro

100 105 110

Asp Ser Val Phe Lys Asn Glu Lys Gly Glu Tyr Ile Lys Ser Thr Glu

115 120 125

Leu Ser Lys Lys Met Glu Glu His Ile Thr Thr Ile Val Gly Arg Tyr

130 135 140

Lys Gly Lys Leu Asp Ala Trp Asp Val Val Asn Glu Ala Val Gly Asp
145 150 155 160

Asp Asn Gln Met Arg Lys Ser His Tyr Tyr Asn Ile Leu Gly Glu Asp

165 170 175

Phe Ile Asp Lys Ala Phe His Leu Ala His Glu Val Asp Pro Lys Ala

180 185 190 

His Leu Met Tyr Asn Asp Tyr Asn Ile Glu Lys Asp Gly Lys Arg Glu

195 200 205 

Ala Thr Leu Glu Met Leu Lys Arg Leu Gln Lys Arg Gly Val Pro Ile

210 215 220 

His Gly Leu Gly Ile Gln Gly His Ile Ala Val Asp Gly Pro Ser Ile
225 230 235 240 

Ala Asp Ile Glu Lys Ser Ile Leu Ala Tyr Ala Asp Leu Gly Leu Arg

245 250 255 

Val His Phe Thr Glu Leu Asp Ile Asp Val Leu Pro Gln Ile Trp Asn

260 265 270 

Leu Pro Val Ala Glu Ile Ser Thr Arg Phe Glu Tyr Lys Pro Glu Arg

275 280 285 

Asp Pro Phe Lys Asn Gly Leu Ser Lys Glu Met Asn Asp Lys Leu Ser 

290 295 300 

Ala Arg Tyr Glu Glu Leu Phe Thr Leu Phe Ile Lys His Lys Asp Lys
305 310 315 320 

Ile Asp Arg Ile Thr Leu Trp Gly Val Ser Asp Asp Ala Thr Trp Leu

325 330 335 

Asn Asp Phe Pro Ile Lys Gly Arg Thr Ser Tyr Pro Leu Leu Phe Asp

340 345 350 

Arg Lys His Gln Pro Lys Asp Ala Tyr Tyr Asn Ile Leu Ala Leu
355 360 365 

<210> 61 
<211> 1041 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 61 

atgagaagaa gcatggaaag gctgcccaag ctccatgaag cttacggcaa tagtttcaag 60 

atcggcgctg ccgtgaatcc aattacgatg gtgacccaaa aggaattgtt gtcacaccac 120 

ttcaacagcg ttacggcaga aaatgaaatg aaattcgagc gattgcaccc atcggaagag 180 

gtgtatacat tcgagcaagc cgaccagatc gtatcgtttg ccaaatcgaa cggaatgtcg 240 

gtgagaggac ataccctcgt atggcataat cagacgccgg aatgggtgtt tcaagacagt 300 

tccggtggga cagccggccg cgagctgctg ctcgctcgga tgaaatcgca catcgatgag 360 

gtcgttggcc gttatcgcgg agatatctat gcttgggatg tcgtaaacga agccattgcc 420 

gacagtggaa gcgatctgct tcgttcctcc ccgtggcttg cgtcgatcgg ggaggatttt 480 

atcgccaagg ctttcgaata tgcgcacgaa gcagacccgc aagcgctgct gttttataac 540 

gattacaacg aatccgtgcc cgagaagcgg gagaagattt acacgctcct taaatcgtta 600 

aaggagcagg atgtgccgat tcacggcgtc gggcttcagg cccattggaa tttggagttt 660 

ccatcgcttg acgatatccg cagggcaatc gaaaggtatg caagccttgg catgatcttg 720 

catatcacgg agcttgacgt atccgtattc gcgcatgagg ataagcggac cgatctggcg 780 

gcgccgaccg aagaaatgct tgagcgccag gcggagcgtt acggtcaatt gttccgtctg 840 

cttaaagagt acagcggcag cgtcacttcc gtgaccttct ggggagcggc ggacgattat 900 

acctggctgg atcattttcc ggtaaggggc cgcaaaaatt ggccgttcgt cttcgacgag 960 

aaccatcttc cgaaggaatc ctattggaac ctgttgaagg aagccaatcc cgaaagaaca 1020 

ttccaagaga tacgttcgta a 1041 

<210> 62 
<211> 346 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 62 

Met Arg Arg Ser Met Glu Arg Leu Pro Lys Leu His Glu Ala Tyr Gly
1 5 10 15 

Asn Ser Phe Lys Ile Gly Ala Ala Val Asn Pro Ile Thr Met Val Thr

20 25 30 

Gln Lys Glu Leu Leu Ser His His Phe Asn Ser Val Thr Ala Glu Asn

35 40 45 

Glu Met Lys Phe Glu Arg Leu His Pro Ser Glu Glu Val Tyr Thr Phe 
50 55 60

Glu Gln Ala Asp Gln Ile Val Ser Phe Ala Lys Ser Asn Gly Met Ser
65 70 75 80 

Val Arg Gly His Thr Leu Val Trp His Asn Gln Thr Pro Glu Trp Val

85 90 95 

Phe Gln Asp Ser Ser Gly Gly Thr Ala Gly Arg Glu Leu Leu Leu Ala

100 105 110 

Arg Met Lys Ser His Ile Asp Glu Val Val Gly Arg Tyr Arg Gly Asp

115 120 125 

Ile Tyr Ala Trp Asp Val Val Asn Glu Ala Ile Ala Asp Ser Gly Ser

130 135 140 

Asp Leu Leu Arg Ser Ser Pro Trp Leu Ala Ser Ile Gly Glu Asp Phe
145 150 155 160 

Ile Ala Lys Ala Phe Glu Tyr Ala His Glu Ala Asp Pro Gln Ala Leu

165 170 175 

Leu Phe Tyr Asn Asp Tyr Asn Glu Ser Val Pro Glu Lys Arg Glu Lys

180 185 190 

Ile Tyr Thr Leu Leu Lys Ser Leu Lys Glu Gln Asp Val Pro Ile His

195 200 205 

Gly Val Gly Leu Gln Ala His Trp Asn Leu Glu Phe Pro Ser Leu Asp

210 215 220 

Asp Ile Arg Arg Ala Ile Glu Arg Tyr Ala Ser Leu Gly Met Ile Leu
225 230 235 240

His Ile Thr Glu Leu Asp Val Ser Val Phe Ala His Glu Asp Lys Arg

245 250 255 

Thr Asp Leu Ala Ala Pro Thr Glu Glu Met Leu Glu Arg Gln Ala Glu

260 265 270 

Arg Tyr Gly Gln Leu Phe Arg Leu Leu Lys Glu Tyr Ser Gly Ser Val

275 280 285 

Thr Ser Val Thr Phe Trp Gly Ala Ala Asp Asp Tyr Thr Trp Leu Asp 

290 295 300 

His Phe Pro Val Arg Gly Arg Lys Asn Trp Pro Phe Val Phe Asp Glu 
305 310 315 320 

Asn His Leu Pro Lys Glu Ser Tyr Trp Asn Leu Leu Lys Glu Ala Asn 

325 330 335 

Pro Glu Arg Thr Phe Gln Glu Ile Arg Ser
340 345 

<210> 63 
<211> 1110 
<212> DNA 
<213> Unknown

<220> 

<223> Obtained from an environmental sample
<400> 63 

atgaaacgaa ttttaattgg tttggcggct cttaccgctt ccgggctgtc ggcgcagaaa 60 

tccgacggta ctttaaaaaa agcatttcag gataaattct atatcgggac tgcgatgagt 120 

cttcctcaga ttgatgggac agataaaaga gcggtagcca ttatcagaaa tcagttcagt 180 

tctattgttg ctgaaaactg tatgaaatcg atgtttctgc aacctcagga aggaaagttc 240 

ttctttgatg acgctgataa atttgttgat ttcgggatga aaaacaatat gttcgtcatc 300 

ggacatacgc taatctggca ttcccagctt ccaaaatggt tttttacaga taaaaatgga 360 

aaagatgttt ctccggaagt attgaaacag cgcatgaaaa accacattac aaccgtagtt 420 

tcccgttaca aaggaaaagt aaaaggatgg gatgtggtga atgaagccat tcttgaagac 480 

ggaacctata gaaaaagtaa attttacgaa attctgggtg aagattttat tcctttggcg 540 

tttcagtatg cacaggaagc cgatcccaat gcagaattat attacaacga ttataatgaa 600 

tggtatccgg aaaaggtaaa agcagtcatt acaatggttg aaaagcttaa atcaagagga 660 

atccgtattg atggagtagg aatgcaggcc catgtcggaa tggatatccc ttccatcaat 720 

gaatatgaaa aagcaattct ggcgtattcc aatgccggag ttaaagttaa tattacggag 780 

ctggaaatta gtgcgctgcc ttctccgtgg ggaagctctg ccaatgtttc agataccgtt 840 

gcctatcaga aagaaatgaa tccttacacc aaagggcttc ccaatgaagt agaagcgaaa 900 

tgggaaaaac gttaccttga tttctttagc ttgtttttaa aacataaaga taaaataaga 960 

agggtgacct tatggggagt tactgataag cagtcctgga aaaacgattt tccggtaaaa 1020 

ggaagaacag attacccgtt gctgtttgac aggaaagatc aggagaaacc tgtagtacaa 1080 

aaaataataa aattggcaga gaaaaattaa 1110 

<210> 64 
<211> 369 

<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample

<221> SIGNAL 
<222> (1)...(20) 

<400> 64

Met Lys Arg Ile Leu Ile Gly Leu Ala Ala Leu Thr Ala Ser Gly Leu

1 5 10 15

Ser Ala Gln Lys Ser Asp Gly Thr Leu Lys Lys Ala Phe Gln Asp Lys

20 25 30 

Phe Tyr Ile Gly Thr Ala Met Ser Leu Pro Gln Ile Asp Gly Thr Asp

35 40 45 

Lys Arg Ala Val Ala Ile Ile Arg Asn Gln Phe Ser Ser Ile Val Ala

50 55 60 

Glu Asn Cys Met Lys Ser Met Phe Leu Gln Pro Gln Glu Gly Lys Phe
65 70 75 80 

Phe Phe Asp Asp Ala Asp Lys Phe Val Asp Phe Gly Met Lys Asn Asn

85 90 95 

Met Phe Val Ile Gly His Thr Leu Ile Trp His Ser Gln Leu Pro Lys

100 105 110 

Trp Phe Phe Thr Asp Lys Asn Gly Lys Asp Val Ser Pro Glu Val Leu 

115 120 125 

Lys Gln Arg Met Lys Asn His Ile Thr Thr Val Val Ser Arg Tyr Lys

130 135 140 

Gly Lys Val Lys Gly Trp Asp Val Val Asn Glu Ala Ile Leu Glu Asp
145 150 155 160

Gly Thr Tyr Arg Lys Ser Lys Phe Tyr Glu Ile Leu Gly Glu Asp Phe

165 170 175 

Ile Pro Leu Ala Phe Gln Tyr Ala Gln Glu Ala Asp Pro Asn Ala Glu

180 185 190 

Leu Tyr Tyr Asn Asp Tyr Asn Glu Trp Tyr Pro Glu Lys Val Lys Ala

195 200 205 

Val Ile Thr Met Val Glu Lys Leu Lys Ser Arg Gly Ile Arg Ile Asp

210 215 220 

Gly Val Gly Met Gln Ala His Val Gly Met Asp Ile Pro Ser Ile Asn
225 230 235 240 

Glu Tyr Glu Lys Ala Ile Leu Ala Tyr Ser Asn Ala Gly Val Lys Val

245 250 255 

Asn Ile Thr Glu Leu Glu Ile Ser Ala Leu Pro Ser Pro Trp Gly Ser

260 265 270 

Ser Ala Asn Val Ser Asp Thr Val Ala Tyr Gln Lys Glu Met Asn Pro

275 280 285 

Tyr Thr Lys Gly Leu Pro Asn Glu Val Glu Ala Lys Trp Glu Lys Arg 

290 295 300 

Tyr Leu Asp Phe Phe Ser Leu Phe Leu Lys His Lys Asp Lys Ile Arg
305 310 315 320 

Arg Val Thr Leu Trp Gly Val Thr Asp Lys Gln Ser Trp Lys Asn Asp

325 330 335 

Phe Pro Val Lys Gly Arg Thr Asp Tyr Pro Leu Leu Phe Asp Arg Lys 

340 345 350 

Asp Gln Glu Lys Pro Val Val Gln Lys Ile Ile Lys Leu Ala Glu Lys
355 360 365 

Asn 

<210> 65 
<211> 1557 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 65 

atgaaaagaa tcggactgtt gctgctggct gtgatcatgc ttgtgggctg tgtatattcc 60 
gcggcggcgg aggatacgct ggtttatgct tccacttttg tggccggaac ggacggatgg 120

tacgcccgcg gagcgcagaa agtataccgc acaaccgagg agacactgcg gacggaaggc 180 

cggaccagcg actggcattc cccgggccgt gattttgacc tggtggaagg cggcgtctat 240 

gtcctgagcg tggaagtgtt ccaggacgaa gcggacaacg ccagcttcat gatttccatc 300 

gcccacagca aggacggtac ggaaacctat gaaaacctgg ctcgcggaac cgccaaacgc 360 

ggcgagtggg tcaccctgac cggaacatat accgccggca attttgaccg gaacgtcctg 420 

tatgtggaaa cgaccggatc gccggaactg agctatgaaa tccggaattt ccgggttgaa 480 

gcgccgaacg gagttccgga gccgaaggct acggagcccc cgatggtgat tgaggcggtg 540 

gagaacctcc cgggcctgaa gaacgcgtat gcgggaaaat ttgatttcgg cgcggcggtt 600 

ccgggatacg ctttcggcga tccgggcctg aaacagctga tgactgagca gttcagcatc 660 

ctgacgcccg aaaacgaact gaaaccggac gctgtgctgg acgtggcggc gagcaagcgg 720 

ctggcccagg aggatgaaac ggcggtggcg gttcattttg acggcgccat tccgctgctg 780 

aactttgccc gggacaacgg catcagggtg cacggacatg tgctgatctg gcacagccag 840 

acgccggaag cgttcttcca tgagggctat gacacctcca agcccctggt cagccgggaa 900 

gtgatgctgg gccggatgga aaactatatc cgcgaggtgc tgacctggac gaacgagaat 960 

tatccgggcg tgatcgtatc ctgggacgtg gtgaacgaag ccattgatga cggaacgaac 1020 

tggctgcgga attccaactg gtacaagacg gtgggcggcg actttgtgaa ccgggctttt 1080 

gaatttgccc gcatgtacgc ggcggacggc gtcctcctgt attacaatga ttacaatacc 1140 

gcctatccgg ccaaacggaa gggaatcatc aagctgctgg gccagctgat tgaggaaggc 1200 

aatattgacg gatacggctt ccagatgcat cacagcaccg gcgagccttc catggagatg 1260 

atcaccgctt cggtggagga aatcgccgcg ctgggaataa aactgcgggt cagcgagctg 1320 

gatgtgggca tgggcagcag catgacggaa gaagccctga tgaaacagaa ggacaaatac 1380 

aaggcggtca tggaactgat gctgcggttt gccgaccaga cggaagcggt gcaggtatgg 1440 

ggactgacgg acaatatgag ctggcggacc ggccagaatc cgctgctgtt tgaccggaac 1500 

cggaacccga agccggcctt cttcggcgtc ctggaagcgg cggaagaaag caaataa 1557 

<210> 66 
<211> 518 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(22) 

<400> 66 

Met Lys Arg Ile Gly Leu Leu Leu Leu Ala Val Ile Met Leu Val Gly

1 5 10 15 

Cys Val Tyr Ser Ala Ala Ala Glu Asp Thr Leu Val Tyr Ala Ser Thr

20 25 30 

Phe Val Ala Gly Thr Asp Gly Trp Tyr Ala Arg Gly Ala Gln Lys Val

35 40 45 

Tyr Arg Thr Thr Glu Glu Thr Leu Arg Thr Glu Gly Arg Thr Ser Asp

50 55 60 

Trp His Ser Pro Gly Arg Asp Phe Asp Leu Val Glu Gly Gly Val Tyr
65 70 75 80 

Val Leu Ser Val Glu Val Phe Gln Asp Glu Ala Asp Asn Ala Ser Phe

85 90 95 

Met Ile Ser Ile Ala His Ser Lys Asp Gly Thr Glu Thr Tyr Glu Asn

100 105 110 

Leu Ala Arg Gly Thr Ala Lys Arg Gly Glu Trp Val Thr Leu Thr Gly 

115 120 125 

Thr Tyr Thr Ala Gly Asn Phe Asp Arg Asn Val Leu Tyr Val Glu Thr

130 135 140 

Thr Gly Ser Pro Glu Leu Ser Tyr Glu Ile Arg Asn Phe Arg Val Glu
145 150 155 160 

Ala Pro Asn Gly Val Pro Glu Pro Lys Ala Thr Glu Pro Pro Met Val 

165 170 175 

Ile Glu Ala Val Glu Asn Leu Pro Gly Leu Lys Asn Ala Tyr Ala Gly

180 185 190 

Lys Phe Asp Phe Gly Ala Ala Val Pro Gly Tyr Ala Phe Gly Asp Pro 

195 200 205 

Gly Leu Lys Gln Leu Met Thr Glu Gln Phe Ser Ile Leu Thr Pro Glu

210 215 220 

Asn Glu Leu Lys Pro Asp Ala Val Leu Asp Val Ala Ala Ser Lys Arg
225 230 235 240 

Leu Ala Gln Glu Asp Glu Thr Ala Val Ala Val His Phe Asp Gly Ala
245 250 255

Ile Pro Leu Leu Asn Phe Ala Arg Asp Asn Gly Ile Arg Val His Gly

260 265 270 

His Val Leu Ile Trp His Ser Gln Thr Pro Glu Ala Phe Phe His Glu

275 280 285 

Gly Tyr Asp Thr Ser Lys Pro Leu Val Ser Arg Glu Val Met Leu Gly

290 295 300 

Arg Met Glu Asn Tyr Ile Arg Glu Val Leu Thr Trp Thr Asn Glu Asn
305 310 315 320

Tyr Pro Gly Val Ile Val Ser Trp Asp Val Val Asn Glu Ala Ile Asp

325 330 335 

Asp Gly Thr Asn Trp Leu Arg Asn Ser Asn Trp Tyr Lys Thr Val Gly

340 345 350 

Gly Asp Phe Val Asn Arg Ala Phe Glu Phe Ala Arg Met Tyr Ala Ala

355 360 365 

Asp Gly Val Leu Leu Tyr Tyr Asn Asp Tyr Asn Thr Ala Tyr Pro Ala 

370 375 380 

Lys Arg Lys Gly Ile Ile Lys Leu Leu Gly Gln Leu Ile Glu Glu Gly
385 390 395 400 

Asn Ile Asp Gly Tyr Gly Phe Gln Met His His Ser Thr Gly Glu Pro

405 410 415 

Ser Met Glu Met Ile Thr Ala Ser Val Glu Glu Ile Ala Ala Leu Gly

420 425 430 

Ile Lys Leu Arg Val Ser Glu Leu Asp Val Gly Met Gly Ser Ser Met

435 440 445 

Thr Glu Glu Ala Leu Met Lys Gln Lys Asp Lys Tyr Lys Ala Val Met

450 455 460 

Glu Leu Met Leu Arg Phe Ala Asp Gln Thr Glu Ala Val Gln Val Trp
465 470 475 480 

Gly Leu Thr Asp Asn Met Ser Trp Arg Thr Gly Gln Asn Pro Leu Leu

485 490 495 

Phe Asp Arg Asn Arg Asn Pro Lys Pro Ala Phe Phe Gly Val Leu Glu

500 505 510 

Ala Ala Glu Glu Ser Lys
515 

<210> 67 
<211> 1224 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 67 

atgcggaacg tcgtgcgtaa accattgaca atcggactcg ctttaacact attattgccc 60 

atgggaatga cggcaacatc agcgaagaat gcagattcct atgcgaaaaa acctcacatc 120 

agcgcattga atgccccaca attggatcaa cgctacaaaa acgagttcac gattggtgcg 180 

gcagtagaac cttatcaact acaaaatgaa aaagacgtac aaatgctaaa gcgccacttc 240 

aacagcattg ttgccgagaa cgtaatgaaa ccgatcagca ttcaacctga ggaaggaaaa 300 

ttcaattttg aacaagcgga tcgaattgtg aagttcgcta aggcaaatgg catggatatt 360 

cgcttccata cactcgtttg gcacagccaa gtacctcaat ggttctttct tgacaaggaa 420 

ggcaagccaa tggttaatga aacagatcca gtgaaacgtg aacaaaataa acaactgctg 480 

ttaaaacgac ttgaaactca tattaaaacg atcgtcgagc ggtacaaaga tgacattaag 540 

tactgggacg ttgtaaatga ggttgtgggg gacgacggaa aactgcgcaa ctctccatgg 600 

tatcaaatcg ccggcatcga ttatattaaa gtggcattcc aaacagcgag aaaatatggc 660 

ggcaacaaga ttaaacttta tatcaatgat tacaataccg aagtggaacc aaagcgaagc 720 

gctctttata acttggtgaa gcaattaaaa gaagagggcg ttcctattga cggcatcggc 780 

catcaatccc acattcaaat cggctggcct tctgaagcag aaatcgagaa aacgattaac 840 

atgttcgccg ctctcggctt agacaaccaa atcactgagc ttgatgtgag catgtacggt 900 

tggccgccgc gcgcttaccc gacgtatgac gccattccaa aacaaaagtt tttggatcag 960 

gcagcgcgct atgatcgttt gttcaaactg tatgaaaagt tgagcgataa aattagcaac 1020 

gtcaccttct ggggcatcgc cgacaatcat acgtggctcg acagccgtgc ggatgtgtac 1080 

tatgacgcca acgggaatgt tgtggttgac ccgaacgctc cgtacgcaaa agtggaaaaa 1140 

gggaaaggaa aagatgcgcc gttcgttttt ggaccggatt acaaagtcaa acccgcatat 1200 

tgggctatta tcgaccacaa atag 1224 

<210> 68 
<211> 407 
<212> PRT
<213> Unknown 

<220> 

<223> Obtained from an environmental sample

<221> SIGNAL 
<222> (1)...(28) 

<400> 68 

Met Arg Asn Val Val Arg Lys Pro Leu Thr Ile Gly Leu Ala Leu Thr

1 5 10 15

Leu Leu Leu Pro Met Gly Met Thr Ala Thr Ser Ala Lys Asn Ala Asp 

20 25 30 

Ser Tyr Ala Lys Lys Pro His Ile Ser Ala Leu Asn Ala Pro Gln Leu

35 40 45 

Asp Gln Arg Tyr Lys Asn Glu Phe Thr Ile Gly Ala Ala Val Glu Pro

50 55 60 

Tyr Gln Leu Gln Asn Glu Lys Asp Val Gln Met Leu Lys Arg His Phe
65 70 75 80

Asn Ser Ile Val Ala Glu Asn Val Met Lys Pro Ile Ser Ile Gln Pro

85 90 95 

Glu Glu Gly Lys Phe Asn Phe Glu Gln Ala Asp Arg Ile Val Lys Phe

100 105 110 

Ala Lys Ala Asn Gly Met Asp Ile Arg Phe His Thr Leu Val Trp His

115 120 125

Ser Gln Val Pro Gln Trp Phe Phe Leu Asp Lys Glu Gly Lys Pro Met

130 135 140 

Val Asn Glu Thr Asp Pro Val Lys Arg Glu Gln Asn Lys Gln Leu Leu
145 150 155 160 

Leu Lys Arg Leu Glu Thr His Ile Lys Thr Ile Val Glu Arg Tyr Lys

165 170 175 

Asp Asp Ile Lys Tyr Trp Asp Val Val Asn Glu Val Val Gly Asp Asp

180 185 190 

Gly Lys Leu Arg Asn Ser Pro Trp Tyr Gln Ile Ala Gly Ile Asp Tyr

195 200 205 

Ile Lys Val Ala Phe Gln Thr Ala Arg Lys Tyr Gly Gly Asn Lys Ile

210 215 220 

Lys Leu Tyr Ile Asn Asp Tyr Asn Thr Glu Val Glu Pro Lys Arg Ser
225 230 235 240 

Ala Leu Tyr Asn Leu Val Lys Gln Leu Lys Glu Glu Gly Val Pro Ile

245 250 255 

Asp Gly Ile Gly His Gln Ser His Ile Gln Ile Gly Trp Pro Ser Glu

260 265 270 

Ala Glu Ile Glu Lys Thr Ile Asn Met Phe Ala Ala Leu Gly Leu Asp

275 280 285 

Asn Gln Ile Thr Glu Leu Asp Val Ser Met Tyr Gly Trp Pro Pro Arg

290 295 300 

Ala Tyr Pro Thr Tyr Asp Ala Ile Pro Lys Gln Lys Phe Leu Asp Gln
305 310 315 320 

Ala Ala Arg Tyr Asp Arg Leu Phe Lys Leu Tyr Glu Lys Leu Ser Asp 

325 330 335 

Lys Ile Ser Asn Val Thr Phe Trp Gly Ile Ala Asp Asn His Thr Trp

340 345 350 

Leu Asp Ser Arg Ala Asp Val Tyr Tyr Asp Ala Asn Gly Asn Val Val

355 360 365 

Val Asp Pro Asn Ala Pro Tyr Ala Lys Val Glu Lys Gly Lys Gly Lys

370 375 380 

Asp Ala Pro Phe Val Phe Gly Pro Asp Tyr Lys Val Lys Pro Ala Tyr 
385 390 395 400 

Trp Ala Ile Ile Asp His Lys
405 

<210> 69 
<211> 1596 
<212> DNA 
<213> Unknown 

<220> 
<223> Obtained from an environmental sample
<400> 69 

atggcgatgc atagatttaa gcaattaggg gccatcctac ttgtcctatg gttttgtgca 60 

ttgccagtgc aggcgcaggc ttggcgtgcg gccgcagagc agcgtattga acagtaccgt 120 

aaggggccac tgcgggttca ggtgaaggat cctgaaggac ggcccgtacc gaatgcccaa 180 

gtgcacgttc gcatgacgcg tcacgctttt ggatttggta cggctgtcag ctttggcctg 240 

gtcgtggggt cgggatacaa ccccacctat cgggccaagc tagaagacct gacgggcgac 300 

ggccgcacat tcaacatggc tacgccagag aatgaattga agtggcctgc gtgggagtcg 360 

gaatggccca tttcgaatcg tcgaaagatc gacgtcatca actggctgcg cgcaaaaggc 420 

tacagcattc gaggacacaa cctgctatgg cctgactggc aatggatgcc ccgtgatatt 480 

gagcaaaacc gcaacaatcc acagtacatc tacgatcgcg ttcgcaatca cattgcggcg 540 

ttggctgggc atcgggacat tcggggcaaa ctgcgggact gggatgttct taacgaacca 600 

gcccacctga ccgcattgcg cgatgtgttt aacggttggg gctcatatga gcgtggggaa 660 

gacttctatg tggatgtctt taggtgggcc aaggcagcag actcgaccgc ccgtctatac 720 

atcaacgagt acaacattat caacaactac gccaacgagc agcctacgcg caactattac 780 

aagtggatca ttgcacgcct aatctcaaaa ggagcgccta tcgaagggat cggcattcag 840 

gggcatattt cggcaccact gccaagcatg agtgaggtca aggcagccct agacgaaatg 900 

gcagtttttg gattgccttt ggccatcaca gaatacgacg ttaccggcgt ttcggaagaa 960 

gtcgaagcca actttatgcg ggactttttg accatggtct ttagtcatcc cgctgtggag 1020 

agcttcgtca tgtggggttt ctggagcgga gcacactggc gtgacaatgc gccgctgttt 1080 

cgggccgact ggagtctcaa gccttcggga caggtgttcc ttgatctggt ctttcggcgc 1140 

tggtggaccg atactacggg ggtaaccggt ccagatggca gctggtctgt acgcggattt 1200 

ttaggggatt acgttgtgga agtgcaggtg ggggaggttt cagtgaccaa gtccctgcgc 1260 

ctcgaaagcc cgcaggatac aaccacgcta gaggtggtgg tcagtagcgt taaggtgggt 1320 

gaaaagccta cagaagacgt gttgcgcgtg caagggtttg gaccagaccc ctttgtcgaa 1380 

ggaacggcgc tgcgctactg gttagggcgg ccggccgatg ttgaactggc agtgtatgat 1440 

gtgctgggcc gacaggtcta cgccgtgcaa aagcatcgcg tagctggttg gcatactgaa 1500 

tgggtcgagg cttcccactg gcctgcagga ctttatctgt accgactcca agcaggtgat 1560 

ctgttgcaca cgggtagaat ggtcaagatc caataa 1596 

<210> 70 
<211> 531 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(25)

<400> 70 

Met Ala Met His Arg Phe Lys Gln Leu Gly Ala Ile Leu Leu Val Leu

1 5 10 15

Trp Phe Cys Ala Leu Pro Val Gln Ala Gln Ala Trp Arg Ala Ala Ala

20 25 30 

Glu Gln Arg Ile Glu Gln Tyr Arg Lys Gly Pro Leu Arg Val Gln Val

35 40 45 

Lys Asp Pro Glu Gly Arg Pro Val Pro Asn Ala Gln Val His Val Arg

50 55 60 

Met Thr Arg His Ala Phe Gly Phe Gly Thr Ala Val Ser Phe Gly Leu
65 70 75 80 

Val Val Gly Ser Gly Tyr Asn Pro Thr Tyr Arg Ala Lys Leu Glu Asp

85 90 95 

Leu Thr Gly Asp Gly Arg Thr Phe Asn Met Ala Thr Pro Glu Asn Glu 

100 105 110 

Leu Lys Trp Pro Ala Trp Glu Ser Glu Trp Pro Ile Ser Asn Arg Arg

115 120 125 

Lys Ile Asp Val Ile Asn Trp Leu Arg Ala Lys Gly Tyr Ser Ile Arg

130 135 140 

Gly His Asn Leu Leu Trp Pro Asp Trp Gln Trp Met Pro Arg Asp Ile
145 150 155 160 

Glu Gln Asn Arg Asn Asn Pro Gln Tyr Ile Tyr Asp Arg Val Arg Asn

165 170 175 

His Ile Ala Ala Leu Ala Gly His Arg Asp Ile Arg Gly Lys Leu Arg

180 185 190 

Asp Trp Asp Val Leu Asn Glu Pro Ala His Leu Thr Ala Leu Arg Asp
195 200 205
Val Phe Asn Gly Trp Gly Ser Tyr Glu Arg Gly Glu Asp Phe Tyr Val

210 215 220

Asp Val Phe Arg Trp Ala Lys Ala Ala Asp Ser Thr Ala Arg Leu Tyr
225 230 235 240

Ile Asn Glu Tyr Asn Ile Ile Asn Asn Tyr Ala Asn Glu Gln Pro Thr

245 250 255 

Arg Asn Tyr Tyr Lys Trp Ile Ile Ala Arg Leu Ile Ser Lys Gly Ala

260 265 270 

Pro Ile Glu Gly Ile Gly Ile Gln Gly His Ile Ser Ala Pro Leu Pro

275 280 285 

Ser Met Ser Glu Val Lys Ala Ala Leu Asp Glu Met Ala Val Phe Gly

290 295 300 

Leu Pro Leu Ala Ile Thr Glu Tyr Asp Val Thr Gly Val Ser Glu Glu
305 310 315 320 

Val Glu Ala Asn Phe Met Arg Asp Phe Leu Thr Met Val Phe Ser His

325 330 335 

Pro Ala Val Glu Ser Phe Val Met Trp Gly Phe Trp Ser Gly Ala His 

340 345 350 

Trp Arg Asp Asn Ala Pro Leu Phe Arg Ala Asp Trp Ser Leu Lys Pro

355 360 365 

Ser Gly Gln Val Phe Leu Asp Leu Val Phe Arg Arg Trp Trp Thr Asp

370 375 380 

Thr Thr Gly Val Thr Gly Pro Asp Gly Ser Trp Ser Val Arg Gly Phe 
385 390 395 400 

Leu Gly Asp Tyr Val Val Glu Val Gln Val Gly Glu Val Ser Val Thr

405 410 415 

Lys Ser Leu Arg Leu Glu Ser Pro Gln Asp Thr Thr Thr Leu Glu Val

420 425 430 

Val Val Ser Ser Val Lys Val Gly Glu Lys Pro Thr Glu Asp Val Leu

435 440 445 

Arg Val Gln Gly Phe Gly Pro Asp Pro Phe Val Glu Gly Thr Ala Leu

450 455 460 

Arg Tyr Trp Leu Gly Arg Pro Ala Asp Val Glu Leu Ala Val Tyr Asp
465 470 475 480

Val Leu Gly Arg Gln Val Tyr Ala Val Gln Lys His Arg Val Ala Gly

485 490 495 

Trp His Thr Glu Trp Val Glu Ala Ser His Trp Pro Ala Gly Leu Tyr 

500 505 510 

Leu Tyr Arg Leu Gin Ala Gly Asp Leu Leu His Thr Gly Arg Met Val 

515 520 525 

Lys lie Gin 
530 

<210> 71 
<211> 1269 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 71 

atgatttcca tcgcccattc agtgaacggg gcggagacct atgagaacct ggcgcacgga 60 

actgccaaaa aaggcgaatg gactacgctg aaggggacat acaccgccgg cgcctatcag 120 

cgcaacgtgc tctatgtgga aacggtttct gaaggcaccc ttgactttga gatccgtaat 180 

tttgtcctga cggctccgaa cggactaccg gagcccaagc cgaccgagcc tccgatggtc 240 

atcgaggaag ccgagaacgt gcccagtctc aaagagattt atgcagacaa attcgatttc 300 

ggctccgccg cgccccagat ggtattccgt gaccccaaat ggctcaacct gatgaaggaa 360 

cagttcagca ttctgacgcc ggaaaacgaa atgaaaccgg attccgttct ggatgtgggc 420 

gcgagcaaag cgctggtgaa ggaaaccggt gatgagaccg ccgtcgccgt tcatttcgac 480 

gctgccaaag cgctgctgaa ttttgccaag agcaacggga tcaaggttca cggccatgtg 540 

ctgatctggc acagccagac gccggaagct ttcttccatc agggatatga ttccaagaag 600

cctttcgtta cacgggaagt gatgctgggc cgaatggaaa attacattaa gggtgttttt 660 

gaatacctgg atgaaaatta tcccggcgtc gttgtctcct gggacgtgct gaatgaggcg 720 

attgacgacg gaagcaactg gctgcggaac agcaactgga gaaagattgt cggcgaagac 780 

tatccgaacc gggcatatga atatgcgcgc aaatatgcgc cggaaggtac gctgctgtat 840 

tacaacgatt acaatacgtc gattcccggg aaactgaacg gcattgtgaa actgctgaac 900 

agtctgattc cggaaggaaa tatcgacggt tacggcttcc agatgcacca tggcgtcggc 960 

ttcccgtcca ttgatatgat ccagactgca gtggaacgga ttgccgcgct gaatatccgc 1020

cttcgcgtca gcgaactgga tgtcacggtg gacaacaaca cggaagcgtc cttcaacaaa 1080

caggcaaagt attatgccga agtcatgaag attctgattg ctcacagcga ccagtttgag 1140 

gctgtgcagg tctgggggct gacagacctg atgagctggc gcggcagtca gttcccgctg 1200 

ctgtttgacg gggcaggcaa tccgaaaccg gcgttctggg ccgtcgcgga tccggattcc 1260 

gtgaaataa 1269

<210> 72 

<211> 422 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 72 

Met Ile Ser Ile Ala His Ser Val Asn Gly Ala Glu Thr Tyr Glu Asn
1 5 10 15

Leu Ala His Gly Thr Ala Lys Lys Gly Glu Trp Thr Thr Leu Lys Gly
20 25 30

Thr Tyr Thr Ala Gly Ala Tyr Gln Arg Asn Val Leu Tyr Val Glu Thr
35 40 45

Val Ser Glu Gly Thr Leu Asp Phe Glu Ile Arg Asn Phe Val Leu Thr
50 55 60

Ala Pro Asn Gly Leu Pro Glu Pro Lys Pro Thr Glu Pro Pro Met Val
65 70 75 80

Ile Glu Glu Ala Glu Asn Val Pro Ser Leu Lys Glu Ile Tyr Ala Asp
85 90 95

Lys Phe Asp Phe Gly Ser Ala Ala Pro Gln Met Val Phe Arg Asp Pro
100 105 110

Lys Trp Leu Asn Leu Met Lys Glu Gln Phe Ser Ile Leu Thr Pro Glu
115 120 125

Asn Glu Met Lys Pro Asp Ser Val Leu Asp Val Gly Ala Ser Lys Ala
130 135 140

Leu Val Lys Glu Thr Gly Asp Glu Thr Ala Val Ala Val His Phe Asp
145 150 155 160

Ala Ala Lys Ala Leu Leu Asn Phe Ala Lys Ser Asn Gly Ile Lys Val
165 170 175

His Gly His Val Leu Ile Trp His Ser Gln Thr Pro Glu Ala Phe Phe
180 185 190

His Gln Gly Tyr Asp Ser Lys Lys Pro Phe Val Thr Arg Glu Val Met
195 200 205

Leu Gly Arg Met Glu Asn Tyr Ile Lys Gly Val Phe Glu Tyr Leu Asp
210 215 220

Glu Asn Tyr Pro Gly Val Val Val Ser Trp Asp Val Leu Asn Glu Ala
225 230 235 240

Ile Asp Asp Gly Ser Asn Trp Leu Arg Asn Ser Asn Trp Arg Lys Ile
245 250 255

Val Gly Glu Asp Tyr Pro Asn Arg Ala Tyr Glu Tyr Ala Arg Lys Tyr
260 265 270

Ala Pro Glu Gly Thr Leu Leu Tyr Tyr Asn Asp Tyr Asn Thr Ser Ile
275 280 285

Pro Gly Lys Leu Asn Gly Ile Val Lys Leu Leu Asn Ser Leu Ile Pro
290 295 300

Glu Gly Asn Ile Asp Gly Tyr Gly Phe Gln Met His His Gly Val Gly
305 310 315 320

Phe Pro Ser Ile Asp Met Ile Gln Thr Ala Val Glu Arg Ile Ala Ala
325 330 335

Leu Asn Ile Arg Leu Arg Val Ser Glu Leu Asp Val Thr Val Asp Asn
340 345 350

Asn Thr Glu Ala Ser Phe Asn Lys Gln Ala Lys Tyr Tyr Ala Glu Val
355 360 365

Met Lys Ile Leu Ile Ala His Ser Asp Gln Phe Glu Ala Val Gln Val
370 375 380

Trp Gly Leu Thr Asp Leu Met Ser Trp Arg Gly Ser Gln Phe Pro Leu
385 390 395 400

Leu Phe Asp Gly Ala Gly Asn Pro Lys Pro Ala Phe Trp Ala Val Ala
405 410 415

Asp Pro Asp Ser Val Lys
420

<210> 73
<211> 4455 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample 
<400> 73 

atgcagaaaa tgagaagaaa attgaaaaga attatgttat tacttctggc agctatgttg 60 

ataatcccgt caggctggat tacacaggct tcagcagcgg aaacaaacaa agatatacct 120 

gttctactgt accatcgaat tgttgataat cctactaatc aatggacgga taccagcgtt 180 

gaaacgttta aacagactat gcaatatcta aatgatagcg gttacaacac cttgtcagcc 240 

gaacaatatg taaagatcat ggatggaacg gcaacggcgc ctgaaaaacc gattctatta 300 

acgtttgacg atggtactcc agaatttatc accaatgctc ttccagtatt aaagcaatat 360 

aacatgaaag ctgttctgtt tattgtcagt gactggatag gcggcggctt cagcatgtca 420 

aaagaacagc tgcaaagttt ggctaatgaa ccatctttaa gcctcgaaaa tcatacgaaa 480 

acccatgacg gtactatttg gggaacaaat ggcggtgtac gtagtacgat aacgaaagaa 540 

caagctgagg accaaattat atcagcgaat acttatctta aaagtattac aggtaaagac 600 

ccagtcctaa tggcataccc ttatggcagc tataatgata ttgcaaaact agtaaaccaa 660 

gaaaatggta ttaagtacgc atttaaagtg ggatacccta atgaagataa ttatgctatg 720 

ggccgtcact atgtaacaaa tcaaagtgtg gctcaaattg cccaaatgat tggcggccct 780 

gtgccagaac caactccaga accaggaaac cagacagaaa ccgtctatca agaaaccttt 840 

gccagtgata ttggtgtagc agttcaagcg ggtaacccac aagtaaccca cgtttctggt 900 

atggtttttg caggcaatga cgatggaaaa gccatctctg ttagcggcag gacgaacaac 960 

tgggacggcg tcgatatccc attcaacaat gtcggtatgg aaaacggcaa aacttatacg 1020 

attacagtta ctggttatgt tgacgaaaat gcaactgttc cttctggcgc acaagcttta 1080 

ctgcagaatg tagacagcta taacggtttg tatgttgccg cagattatgc agcgggacag 1140 

gcttttactt taacgggtca gtataccgtg gatactagta aagatagagc cctacgtatc 1200 

caatcaaatg atgctgggaa aactgttccg ttttacattg gaaacatctt gattacaacg 1260 

aaaaaaacga ctgcgcctga aacagataga gtggtatttc acgaaacatt tggaaatggt 1320 

gttggtgttg ctacacaagc gggaagtgcg aaattgactc ctgtttctga gcttgttttt 1380 

gaaggcaata gcgatggaaa agcaatttct gttaatggca gatccaataa ctgggacgga 1440 

gttgatatac cgttcagcag tgtcagcatg cagaacggca aagcctatac cattacagtc 1500 

actggttttg tttatagcag tgtgagtgtt cctgaaggtg cacaagcttt gcttcagaat 1560 

gtagacagct ataatggctt gtatgcagca gcagatgtta aggcaggtca aacatttact 1620 

ctaacgggtc aatataccgt tgatacgagc aaagatagag cactacgaat tcaatcaaat 1680 

gatgcaggga aaaccgttcc cttctatatc ggagatattc tcattaccga gaaggcagcc 1740 

tctggtggtg gcggggacga tggaagacta cctgccgaac catttacagc aattaatttt 1800 

gaagaccaaa atatgggtgg tttcgaggga agagctggta ccgaaacact aacagtaacc 1860 

aatgaagcca atcatactga tggcggttcc tatgctttga aggttgaagg cagatcacaa 1920 

gcttggcatg gaccagcatt acacgtagag aaatatgttg acaaggattc ggaatataaa 1980 

atttctgcct gggtgaagct gatttcacca gcaacttcac agcttcagct ttctacacag 2040 

gtcggcaatg gcggaactgc tagttacaat aatcttcaag gaaaaactat cagcactgaa 2100 

gatggctggg ttaaacttga gggaacgtac cgttatagca gtgtaggcga tgagttttta 2160 

accatttatg tagagagctc gaataatagc acagcctcct tttatatcga tgatattact 2220 

tttgaatcga ctggttcggg tccgattgaa gttgaggatt tgacaccgat aaaagatgtt 2280 

tatcaagacg atttcttaat tggaaacgct gtctcagctt ctgatcttga aggcaataga 2340 

cttaagcttc tcaacatgca tcacaatgtt gtcacagcag agaatgcaat gaagccagat 2400 

caagcgtata atgcggaaaa acaatttgac tttactgatg aaaatgcgct tgtcgacaag 2460 

gttttggatc agggattgca gctgcatggt cacgtgcttg tatggcacca gcagacgcca 2520 

gaatggttat ttacagctga aaacggtgcc cctttgagcc gtgaggcagc actagcaaat 2580 

ttaaggaccc atgttaaaac agtcgtagaa aattacggta acaaggtaat ttcatgggac 2640 

gtggtaaacg aagcaatcat cgataacccg ccgaacccaa cggattggaa ggcatcactt 2700 

cgtaaatctg gctggtacaa atcgattgga ccagacttcg tagaacaatc cttccttgct 2760 

gcaaaagagg tactgaatga aaaaggcttg aatatcaagc tatattacaa tgattacaat 2820 

gatgataatc agagcaaagc cgaggccatt tatcagatgg tgaaagatat caatgaaaag 2880 

tatgctaaag aacatgatgg ggatcttctc attgacggaa ttggaatgca agcgcactac 2940 

aataaaaaca ctaatcctga aaatgttaaa ctctccctag agaagtttat tacattgggt 3000 

gtagaagtca gtgtgactga acttgacatt accgctggaa ccaataatgt acttactgag 3060 

aaggaagcaa ttgcacaggg ttatttatac gcacaattgt tcaagattta caaagaacac 3120 

gcagagcata tctcacgggt aactttctgg ggactaaatg atgcaacgag ctggagagct 3180 

gcacagagtc cattgttgtt tgataaagat ttgcaagcaa aaccagctta ctatgctgtt 3240 

atcgatccag acacatttac tgtagaaaat caacctgagg taagagaggc taatcaagga 3300 

agtgctgttt ccggcacacc agtgattgat ggaactgtag atggtgtttg gagcaatgca 3360 

acggaactgc cgattaatcg cttccaaatg gcttggcagg gagcaaacgg ggtatccaag 3420 

gtcctctggg ataatgaaaa cctgtatgtt ttaattcaag taagtgactc acagctcgac 3480 

aaatcgagtc caaatccatg ggaacaggat tccattgaag tctttgtaga tgagaataat 3540

gcaaagacat cttccttcga agatggtgat ggacaatatc gagtaaactt tgacaatgaa 3600

acatccttta accctgtcag agttggagaa ggtttcgaat ctgcaaccaa agcatcaggt 3660 

aatggctata ccgttgaagt aaagattccg ttcaaaacca ttacaccaga taacaatacg 3720 

aaaatcggtt ttgatgttca gattaatgac ggtaaagatg gtgctcgtca aagtgctgca 3780 

acatggaacg atttaactgg tctgggatat caggacactt ctgtgttcgg cgtcctgaca 3840 

cttatgaaga ctgacaccac cgcgcctgtt acaaccgata acggaccaga agattgggtt 3900 

aataaagatg taacgattgc tttcagtgca aatgataatg acactggtgt ggcggcaacc 3960 

tattatagta ttgataatgg ggtcgtacaa aacggtaatt cagttactat ttcggaagag 4020 

ggtgtccaca ttctaacata ttggagtgta gacaaagctg gtaatgtcga gcaggttcat 4080 

acaaaaacaa ttaaactaga taagaccgga ccaatattag atattaaact cgacaaaaca 4140 

acattatcac cagttaatca taagatggtc ccaatatcgg cggctattag tgcatctgat 4200 

gccgattcag gaattcattc agtagtgtta acatcaatta ctagcaatga atctatccaa 4260 

cctgatgata ttcagaatgc caactataat aaacctatta caggtactac ggattccttt 4320 

aaacttcgtg cagaaagatt agcaaacggt aatggccgtg tttacaccat tacttatacg 4380 

gccacagata aagctggtaa tgtgacaaca aaaagtgttg aagtttccgt tccacgcgac 4440 

aattctaaaa aataa 4455 

<210> 74 
<211> 1484 
<212> PRT 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(21) 

<400> 74 

Met Gln Lys Met Arg Arg Lys Leu Lys Arg Ile Met Leu Leu Leu Leu
1 5 10 15

Ala Ala Met Leu Ile Ile Pro Ser Gly Trp Ile Thr Gln Ala Ser Ala
20 25 30

Ala Glu Thr Asn Lys Asp Ile Pro Val Leu Leu Tyr His Arg Ile Val
35 40 45

Asp Asn Pro Thr Asn Gln Trp Thr Asp Thr Ser Val Glu Thr Phe Lys
50 55 60

Gln Thr Met Gln Tyr Leu Asn Asp Ser Gly Tyr Asn Thr Leu Ser Ala
65 70 75 80

Glu Gln Tyr Val Lys Ile Met Asp Gly Thr Ala Thr Ala Pro Glu Lys
85 90 95

Pro Ile Leu Leu Thr Phe Asp Asp Gly Thr Pro Glu Phe Ile Thr Asn
100 105 110

Ala Leu Pro Val Leu Lys Gln Tyr Asn Met Lys Ala Val Leu Phe Ile
115 120 125

Val Ser Asp Trp Ile Gly Gly Gly Phe Ser Met Ser Lys Glu Gln Leu
130 135 140

Gln Ser Leu Ala Asn Glu Pro Ser Leu Ser Leu Glu Asn His Thr Lys
145 150 155 160

Thr His Asp Gly Thr Ile Trp Gly Thr Asn Gly Gly Val Arg Ser Thr
165 170 175

Ile Thr Lys Glu Gln Ala Glu Asp Gln Ile Ile Ser Ala Asn Thr Tyr
180 185 190

Leu Lys Ser Ile Thr Gly Lys Asp Pro Val Leu Met Ala Tyr Pro Tyr
195 200 205

Gly Ser Tyr Asn Asp Ile Ala Lys Leu Val Asn Gln Glu Asn Gly Ile
210 215 220

Lys Tyr Ala Phe Lys Val Gly Tyr Pro Asn Glu Asp Asn Tyr Ala Met
225 230 235 240

Gly Arg His Tyr Val Thr Asn Gln Ser Val Ala Gln Ile Ala Gln Met
245 250 255

Ile Gly Gly Pro Val Pro Glu Pro Thr Pro Glu Pro Gly Asn Gln Thr
260 265 270

Glu Thr Val Tyr Gln Glu Thr Phe Ala Ser Asp Ile Gly Val Ala Val
275 280 285

Gln Ala Gly Asn Pro Gln Val Thr His Val Ser Gly Met Val Phe Ala
290 295 300

Gly Asn Asp Asp Gly Lys Ala Ile Ser Val Ser Gly Arg Thr Asn Asn
305 310 315 320

Trp Asp Gly Val Asp Ile Pro Phe Asn Asn Val Gly Met Glu Asn Gly
325 330 335

Lys Thr Tyr Thr Ile Thr Val Thr Gly Tyr Val Asp Glu Asn Ala Thr
340 345 350

Val Pro Ser Gly Ala Gln Ala Leu Leu Gln Asn Val Asp Ser Tyr Asn
355 360 365

Gly Leu Tyr Val Ala Ala Asp Tyr Ala Ala Gly Gln Ala Phe Thr Leu
370 375 380

Thr Gly Gln Tyr Thr Val Asp Thr Ser Lys Asp Arg Ala Leu Arg Ile
385 390 395 400

Gln Ser Asn Asp Ala Gly Lys Thr Val Pro Phe Tyr Ile Gly Asn Ile
405 410 415

Leu Ile Thr Thr Lys Lys Thr Thr Ala Pro Glu Thr Asp Arg Val Val
420 425 430

Phe His Glu Thr Phe Gly Asn Gly Val Gly Val Ala Thr Gln Ala Gly
435 440 445

Ser Ala Lys Leu Thr Pro Val Ser Glu Leu Val Phe Glu Gly Asn Ser
450 455 460

Asp Gly Lys Ala Ile Ser Val Asn Gly Arg Ser Asn Asn Trp Asp Gly
465 470 475 480

Val Asp Ile Pro Phe Ser Ser Val Ser Met Gln Asn Gly Lys Ala Tyr
485 490 495

Thr Ile Thr Val Thr Gly Phe Val Tyr Ser Ser Val Ser Val Pro Glu
500 505 510

Gly Ala Gln Ala Leu Leu Gln Asn Val Asp Ser Tyr Asn Gly Leu Tyr
515 520 525

Ala Ala Ala Asp Val Lys Ala Gly Gln Thr Phe Thr Leu Thr Gly Gln
530 535 540

Tyr Thr Val Asp Thr Ser Lys Asp Arg Ala Leu Arg Ile Gln Ser Asn
545 550 555 560

Asp Ala Gly Lys Thr Val Pro Phe Tyr Ile Gly Asp Ile Leu Ile Thr
565 570 575

Glu Lys Ala Ala Ser Gly Gly Gly Gly Asp Asp Gly Arg Leu Pro Ala
580 585 590

Glu Pro Phe Thr Ala Ile Asn Phe Glu Asp Gln Asn Met Gly Gly Phe
595 600 605

Glu Gly Arg Ala Gly Thr Glu Thr Leu Thr Val Thr Asn Glu Ala Asn
610 615 620

His Thr Asp Gly Gly Ser Tyr Ala Leu Lys Val Glu Gly Arg Ser Gln
625 630 635 640

Ala Trp His Gly Pro Ala Leu His Val Glu Lys Tyr Val Asp Lys Asp
645 650 655

Ser Glu Tyr Lys Ile Ser Ala Trp Val Lys Leu Ile Ser Pro Ala Thr
660 665 670

Ser Gln Leu Gln Leu Ser Thr Gln Val Gly Asn Gly Gly Thr Ala Ser
675 680 685

Tyr Asn Asn Leu Gln Gly Lys Thr Ile Ser Thr Glu Asp Gly Trp Val
690 695 700

Lys Leu Glu Gly Thr Tyr Arg Tyr Ser Ser Val Gly Asp Glu Phe Leu
705 710 715 720

Thr Ile Tyr Val Glu Ser Ser Asn Asn Ser Thr Ala Ser Phe Tyr Ile
725 730 735

Asp Asp Ile Thr Phe Glu Ser Thr Gly Ser Gly Pro Ile Glu Val Glu
740 745 750

Asp Leu Thr Pro Ile Lys Asp Val Tyr Gln Asp Asp Phe Leu Ile Gly
755 760 765

Asn Ala Val Ser Ala Ser Asp Leu Glu Gly Asn Arg Leu Lys Leu Leu
770 775 780

Asn Met His His Asn Val Val Thr Ala Glu Asn Ala Met Lys Pro Asp
785 790 795 800

Gln Ala Tyr Asn Ala Glu Lys Gln Phe Asp Phe Thr Asp Glu Asn Ala
805 810 815

Leu Val Asp Lys Val Leu Asp Gln Gly Leu Gln Leu His Gly His Val
820 825 830

Leu Val Trp His Gln Gln Thr Pro Glu Trp Leu Phe Thr Ala Glu Asn
835 840 845

Gly Ala Pro Leu Ser Arg Glu Ala Ala Leu Ala Asn Leu Arg Thr His
850 855 860

Val Lys Thr Val Val Glu Asn Tyr Gly Asn Lys Val Ile Ser Trp Asp

865 870 875 880 

Val Val Asn Glu Ala Ile Ile Asp Asn Pro Pro Asn Pro Thr Asp Trp
885 890 895

Lys Ala Ser Leu Arg Lys Ser Gly Trp Tyr Lys Ser Ile Gly Pro Asp
900 905 910

Phe Val Glu Gln Ser Phe Leu Ala Ala Lys Glu Val Leu Asn Glu Lys
915 920 925

Gly Leu Asn Ile Lys Leu Tyr Tyr Asn Asp Tyr Asn Asp Asp Asn Gln
930 935 940

Ser Lys Ala Glu Ala Ile Tyr Gln Met Val Lys Asp Ile Asn Glu Lys
945 950 955 960

Tyr Ala Lys Glu His Asp Gly Asp Leu Leu Ile Asp Gly Ile Gly Met
965 970 975

Gln Ala His Tyr Asn Lys Asn Thr Asn Pro Glu Asn Val Lys Leu Ser
980 985 990

Leu Glu Lys Phe Ile Thr Leu Gly Val Glu Val Ser Val Thr Glu Leu
995 1000 1005

Asp Ile Thr Ala Gly Thr Asn Asn Val Leu Thr Glu Lys Glu Ala Ile
1010 1015 1020

Ala Gln Gly Tyr Leu Tyr Ala Gln Leu Phe Lys Ile Tyr Lys Glu His
1025 1030 1035 1040

Ala Glu His Ile Ser Arg Val Thr Phe Trp Gly Leu Asn Asp Ala Thr
1045 1050 1055

Ser Trp Arg Ala Ala Gln Ser Pro Leu Leu Phe Asp Lys Asp Leu Gln
1060 1065 1070

Ala Lys Pro Ala Tyr Tyr Ala Val Ile Asp Pro Asp Thr Phe Thr Val
1075 1080 1085

Glu Asn Gln Pro Glu Val Arg Glu Ala Asn Gln Gly Ser Ala Val Ser
1090 1095 1100

Gly Thr Pro Val Ile Asp Gly Thr Val Asp Gly Val Trp Ser Asn Ala
1105 1110 1115 1120

Thr Glu Leu Pro Ile Asn Arg Phe Gln Met Ala Trp Gln Gly Ala Asn
1125 1130 1135

Gly Val Ser Lys Val Leu Trp Asp Asn Glu Asn Leu Tyr Val Leu Ile
1140 1145 1150

Gln Val Ser Asp Ser Gln Leu Asp Lys Ser Ser Pro Asn Pro Trp Glu
1155 1160 1165

Gln Asp Ser Ile Glu Val Phe Val Asp Glu Asn Asn Ala Lys Thr Ser
1170 1175 1180

Ser Phe Glu Asp Gly Asp Gly Gln Tyr Arg Val Asn Phe Asp Asn Glu
1185 1190 1195 1200

Thr Ser Phe Asn Pro Val Arg Val Gly Glu Gly Phe Glu Ser Ala Thr
1205 1210 1215

Lys Ala Ser Gly Asn Gly Tyr Thr Val Glu Val Lys Ile Pro Phe Lys
1220 1225 1230

Thr Ile Thr Pro Asp Asn Asn Thr Lys Ile Gly Phe Asp Val Gln Ile
1235 1240 1245

Asn Asp Gly Lys Asp Gly Ala Arg Gln Ser Ala Ala Thr Trp Asn Asp
1250 1255 1260

Leu Thr Gly Leu Gly Tyr Gln Asp Thr Ser Val Phe Gly Val Leu Thr
1265 1270 1275 1280

Leu Met Lys Thr Asp Thr Thr Ala Pro Val Thr Thr Asp Asn Gly Pro
1285 1290 1295

Glu Asp Trp Val Asn Lys Asp Val Thr Ile Ala Phe Ser Ala Asn Asp
1300 1305 1310

Asn Asp Thr Gly Val Ala Ala Thr Tyr Tyr Ser Ile Asp Asn Gly Val
1315 1320 1325

Val Gln Asn Gly Asn Ser Val Thr Ile Ser Glu Glu Gly Val His Ile
1330 1335 1340

Leu Thr Tyr Trp Ser Val Asp Lys Ala Gly Asn Val Glu Gln Val His
1345 1350 1355 1360

Thr Lys Thr Ile Lys Leu Asp Lys Thr Gly Pro Ile Leu Asp Ile Lys
1365 1370 1375

Leu Asp Lys Thr Thr Leu Ser Pro Val Asn His Lys Met Val Pro Ile
1380 1385 1390

Ser Ala Ala Ile Ser Ala Ser Asp Ala Asp Ser Gly Ile His Ser Val
1395 1400 1405

Val Leu Thr Ser Ile Thr Ser Asn Glu Ser Ile Gln Pro Asp Asp Ile
1410 1415 1420

Gln Asn Ala Asn Tyr Asn Lys Pro Ile Thr Gly Thr Thr Asp Ser Phe
1425 1430 1435 1440

Lys Leu Arg Ala Glu Arg Leu Ala Asn Gly Asn Gly Arg Val Tyr Thr
1445 1450 1455

Ile Thr Tyr Thr Ala Thr Asp Lys Ala Gly Asn Val Thr Thr Lys Ser
1460 1465 1470

Val Glu Val Ser Val Pro Arg Asp Asn Ser Lys Lys
1475 1480

<210> 75 
<211> 1122 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample 



<400> 75 

atgaaaaagc atattgtact cttcgcattt ctttccgtga ttttgctggc ggcccgatcg 60

tcggcgtcgg agcgatttct caaggacgtc ttttcggatt ccttcaaggt cggcgtagcc 120

ctcaatgccg atcagattac gggggcggac tcggccagcc tcgacttgtc cttggctcac 180

ttcgattctc ttgtcgctga aaatgcgatg aagtgggggt cgctcaatcc tgagccgggg 240

gtttacgatt tccgggtggc tgacgccctg gtcgatttgg cggagcggga aggtttgttt 300

ttggttggcc acacactgct ctggcatcag cagacgccgg actgggtttt tctggacgag 360

aagggcgaga ccgccacgcg ggagctggtg ctcgctcgac tggagacgca catccgcacc 420

gtggtcggcc gctaccaggg ccgggtgcag ggctgggatg tggtcaacga agccttgaac 480

gaagacggtt cgttgcggga gtcgaaatgg ttgcagatca tcggcccgga ctacatcgaa 540

ctggcgttcc gcatggcgaa ggaggccgat cccgacgccg agctttatta caatgactac 600

aatgtgtcca agcccggcaa gcgaggtgga gtggtgcgcc tgcttggaga gctgcaggcg 660

aaaggagtta aggtcgatgc ggtcggcatc cagggccact acagtctcgg gcaccctgag 720

ctcgaccagc tcgaggccag catttctgcg ataacggagg ctggggctcc gatcatgata 780

accgagctcg atgtgtcggt cttgcccttt cccgacgcgg agcaaatggg ggcggacgtg 840

tcgctcagct ttgagatgca ggaccacctc aatccctatg ccgatggctt gcccgaggcg 900

gtttcgcagc agctagctga acgttacgcg gccatttttg aagtgttttt gcgccaccag 960

agccacatcg accgcgtgac gttttgggga gtgcacgacg gggtcagctg gtggaactat 1020

tggccgatcg cgggcaggac cgactatccc ttgctgtttg atcgggagct caagcggaaa 1080

gcggccttcg aggcggtggt cgatttagcg gagggccgct ga 1122



<210> 76 
<211> 373 
<212> PRT 
<213> Unknown 



<220> 

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(22)

<400> 76

Met Lys Lys His Ile Val Leu Phe Ala Phe Leu Ser Val Ile Leu Leu
1 5 10 15

Ala Ala Arg Ser Ser Ala Ser Glu Arg Phe Leu Lys Asp Val Phe Ser
20 25 30

Asp Ser Phe Lys Val Gly Val Ala Leu Asn Ala Asp Gln Ile Thr Gly
35 40 45

Ala Asp Ser Ala Ser Leu Asp Leu Ser Leu Ala His Phe Asp Ser Leu
50 55 60

Val Ala Glu Asn Ala Met Lys Trp Gly Ser Leu Asn Pro Glu Pro Gly
65 70 75 80

Val Tyr Asp Phe Arg Val Ala Asp Ala Leu Val Asp Leu Ala Glu Arg
85 90 95

Glu Gly Leu Phe Leu Val Gly His Thr Leu Leu Trp His Gln Gln Thr
100 105 110

Pro Asp Trp Val Phe Leu Asp Glu Lys Gly Glu Thr Ala Thr Arg Glu
115 120 125

Leu Val Leu Ala Arg Leu Glu Thr His Ile Arg Thr Val Val Gly Arg
130 135 140

Tyr Gln Gly Arg Val Gln Gly Trp Asp Val Val Asn Glu Ala Leu Asn
145 150 155 160

Glu Asp Gly Ser Leu Arg Glu Ser Lys Trp Leu Gln Ile Ile Gly Pro
165 170 175

Asp Tyr Ile Glu Leu Ala Phe Arg Met Ala Lys Glu Ala Asp Pro Asp
180 185 190

Ala Glu Leu Tyr Tyr Asn Asp Tyr Asn Val Ser Lys Pro Gly Lys Arg
195 200 205

Gly Gly Val Val Arg Leu Leu Gly Glu Leu Gln Ala Lys Gly Val Lys
210 215 220

Val Asp Ala Val Gly Ile Gln Gly His Tyr Ser Leu Gly His Pro Glu
225 230 235 240

Leu Asp Gln Leu Glu Ala Ser Ile Ser Ala Ile Thr Glu Ala Gly Ala
245 250 255

Pro Ile Met Ile Thr Glu Leu Asp Val Ser Val Leu Pro Phe Pro Asp
260 265 270

Ala Glu Gln Met Gly Ala Asp Val Ser Leu Ser Phe Glu Met Gln Asp
275 280 285

His Leu Asn Pro Tyr Ala Asp Gly Leu Pro Glu Ala Val Ser Gln Gln
290 295 300

Leu Ala Glu Arg Tyr Ala Ala Ile Phe Glu Val Phe Leu Arg His Gln
305 310 315 320

Ser His Ile Asp Arg Val Thr Phe Trp Gly Val His Asp Gly Val Ser
325 330 335

Trp Trp Asn Tyr Trp Pro Ile Ala Gly Arg Thr Asp Tyr Pro Leu Leu
340 345 350

Phe Asp Arg Glu Leu Lys Arg Lys Ala Ala Phe Glu Ala Val Val Asp
355 360 365

Leu Ala Glu Gly Arg
370

<210> 77
<211> 1248
<212> DNA
<213> Unknown

<220>

<223> Obtained from an environmental sample
<400> 77

atgctaaaag ttttacgtaa acctatcgtt tctggattag ctctagcctt attattacct 60

ataggatcga cagttagtgc cgaaacaaat atttcaaata aaccaggtat tagcgggtta 120

acagcaccac aattggacca acgatataaa gattctttca ccataggtgc agcggttgag 180

ccaaatcaat tattagatgc aaaagactca caaatgttaa agcgccattt taatagcatt 240

gtagcagaaa atgtcatgaa gcctagcagt ttacagccag tagaagggca gtttaactgg 300

gaaccggcag ataaacttgt taagtttgcg aaagaaaatg gaatggacat gcgcggccat 360

acgcttgtct ggcatagcca agtaccagat tggttcttca aagatgcaaa tggaaattca 420

atggttgttt ggcagaatgg aaagcaagtg gttgcagatc cgtcaaatct tgaggctaac 480

aaaaagcttt tattaagccg tttagaaaca catgttaata cagtcgtttc tcgttataaa 540

aatgatatta aattttggga cgttgtcaat gaagtaatcg acgaatgggg cggacatcct 600

gaaggtttac gtcaatctcc atggttccta attaccggaa cggactatat taaagtcgct 660

tttgagacag caagacaata tgctgctcca gacgctaagc tttatatcaa tgattacaat 720

acagaagtaa caccaaaaag aacgtactta tacaacctag taaaaagttt aaaacagcaa 780

ggtgttccaa ttgatggtgt tgggcatcag tctcacattc aaatcggctg gccgtctgaa 840

aaagaaattg aagacacaat taacatgttt gctgaactgg ggttagacaa ccaaattact 900

gagcttgatg taagcatgta tggctggcca gtaagggcgt atcctaccta tgattctatt 960

ccagcacaga aatttataga tcaagcagac cgatatgatc gtttatttaa attatatgag 1020

aaattaggcg ataaaatcag caatgtgaca ttctggggaa ttgctgataa ccatacatgg 1080

ttaaatgacc gtgcagatgt ttactatgat gcagatggaa acgttgtaac attggcaaat 1140

gcaccatatg ctaaaatgga agctagatca ggtaaagatg caccatttgt atttgatcca 1200

gaatacaatg taaaaccagc ctattgggcg attatcgacc acaaataa 1248

<210> 78
<211> 415
<212> PRT
<213> Unknown

<220>

<223> obtained from an environmental sample

<221> SIGNAL
<222> (1)...(27)

<400> 78

Met Leu Lys Val Leu Arg Lys Pro Ile Val Ser Gly Leu Ala Leu Ala
1 5 10 15

Leu Leu Leu Pro Ile Gly Ser Thr Val Ser Ala Glu Thr Asn Ile Ser
20 25 30

Asn Lys Pro Gly Ile Ser Gly Leu Thr Ala Pro Gln Leu Asp Gln Arg
35 40 45

Tyr Lys Asp Ser Phe Thr Ile Gly Ala Ala Val Glu Pro Asn Gln Leu
50 55 60

Leu Asp Ala Lys Asp Ser Gln Met Leu Lys Arg His Phe Asn Ser Ile
65 70 75 80

Val Ala Glu Asn Val Met Lys Pro Ser Ser Leu Gln Pro Val Glu Gly
85 90 95

Gln Phe Asn Trp Glu Pro Ala Asp Lys Leu Val Lys Phe Ala Lys Glu
100 105 110

Asn Gly Met Asp Met Arg Gly His Thr Leu Val Trp His Ser Gln Val
115 120 125

Pro Asp Trp Phe Phe Lys Asp Ala Asn Gly Asn Ser Met Val Val Trp
130 135 140

Gln Asn Gly Lys Gln Val Val Ala Asp Pro Ser Asn Leu Glu Ala Asn
145 150 155 160

Lys Lys Leu Leu Leu Ser Arg Leu Glu Thr His Val Asn Thr Val Val
165 170 175

Ser Arg Tyr Lys Asn Asp Ile Lys Phe Trp Asp Val Val Asn Glu Val
180 185 190

Ile Asp Glu Trp Gly Gly His Pro Glu Gly Leu Arg Gln Ser Pro Trp
195 200 205

Phe Leu Ile Thr Gly Thr Asp Tyr Ile Lys Val Ala Phe Glu Thr Ala
210 215 220

Arg Gln Tyr Ala Ala Pro Asp Ala Lys Leu Tyr Ile Asn Asp Tyr Asn
225 230 235 240

Thr Glu Val Thr Pro Lys Arg Thr Tyr Leu Tyr Asn Leu Val Lys Ser
245 250 255

Leu Lys Gln Gln Gly Val Pro Ile Asp Gly Val Gly His Gln Ser His
260 265 270

Ile Gln Ile Gly Trp Pro Ser Glu Lys Glu Ile Glu Asp Thr Ile Asn
275 280 285

Met Phe Ala Glu Leu Gly Leu Asp Asn Gln Ile Thr Glu Leu Asp Val
290 295 300

Ser Met Tyr Gly Trp Pro Val Arg Ala Tyr Pro Thr Tyr Asp Ser Ile
305 310 315 320

Pro Ala Gln Lys Phe Ile Asp Gln Ala Asp Arg Tyr Asp Arg Leu Phe
325 330 335

Lys Leu Tyr Glu Lys Leu Gly Asp Lys Ile Ser Asn Val Thr Phe Trp
340 345 350

Gly Ile Ala Asp Asn His Thr Trp Leu Asn Asp Arg Ala Asp Val Tyr
355 360 365

Tyr Asp Ala Asp Gly Asn Val Val Thr Leu Ala Asn Ala Pro Tyr Ala
370 375 380

Lys Met Glu Ala Arg Ser Gly Lys Asp Ala Pro Phe Val Phe Asp Pro
385 390 395 400

Glu Tyr Asn Val Lys Pro Ala Tyr Trp Ala Ile Ile Asp His Lys
405 410 415

<210> 79 
<211> 1293 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample
<400> 79

atgattggtc tggatttgat ttctggtggt cgtcgcaagg cctgtctggc tgcctgtctg 60 
gcqcttgccg cgctgtcatt gccggtatcg gctcaaatgg ctgcggggaa ggaaaagttc 120 
gtgggtaacg tgatcgctgg ttatgtgccc ggtgattacg gcaatctctg gaatcaggtg 180 
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acgccggaga attccaccaa gtggggagcg gttgagtcta cgcgtaatgt catgaactgg 240 

acgcaggctg atctggccta caactacgcc aagtccaagg gcttcaagtt caagatgcac 300 

acgctggtat ggggctcgca agagccggcc tgggtcaaga atctggatgc gacttcccag 360 

cgtgtcgagg tcgaacagtg gatgcgtctg agctgcgaac gctaccccga ttcctgggct 420 

atcgatgtgg tgaatgaacc cctgcatgcc gtgccctcgt acaagaacgc actgggtggc 480 

gatggtgcca ccggctggga ttgggtcatc acctcgttcc gtctggcgcg tcagtactgt 540 

ccgcgcgcca agctgctgct caatgagtac gccaccgagc tggatgccag caagcgcgcc 600 

aagatcaaga ccattgcctc gctgctcaag agtcgcggtc tgattgatgg tgttggcctg 660 

caggcccatt tcttcacgct ggattacatg aatgccagcc agatgaaggc ggcactggat 720 

gattacgcca cgctgggtgt ggatatctac atttccgagc tggatctgaa gggcagtgcc 780 

aataccgacg ccagccagaa ggcgaagtac gaagagctgt tcccggtgat gtggaatcac 840 

gccagcgtga agggcatcac cctgtggggc tacaaggtgg gtgaaacctg gtcgagcggc 900 

accggcctgc tgaatgcgaa cggtagcgag cgtccggccc tgacctggct gaaaagctat 960 

atgagcagcc gtcctgcagc atcgagcagc agttcttcga gtgtttcatc cagcaaatcc 1020 

agttcgtctt cttctagcca gtccagtgcc tccagcagtg caggcagtgc gccggtcttg 1080 

tccggcacca gtgattaccc gagcggtttc agcaagtgtg ccgatctggg cggcacttgc 1140 

agcgtgtctt ccggcaccgg ctgggcggcc ttcgggcgca agggtaagtg ggttgccaaa 1200 

tacgtcggtg tgggcaagag cattccctgc acggtggcgg cgtttggtcg tgacccgggg 1260 

ggcaatccca acaagtgttc cttccagagg taa 1293

<210> 80 

<211> 430 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(36)

<400> 80 

Met Ile Gly Leu Asp Leu Ile Ser Gly Gly Arg Arg Lys Ala Cys Leu
1 5 10 15

Ala Ala Cys Leu Ala Leu Ala Ala Leu Ser Leu Pro Val Ser Ala Gln
20 25 30

Met Ala Ala Gly Lys Glu Lys Phe Val Gly Asn Val Ile Ala Gly Tyr
35 40 45

Val Pro Gly Asp Tyr Gly Asn Leu Trp Asn Gln Val Thr Pro Glu Asn
50 55 60

Ser Thr Lys Trp Gly Ala Val Glu Ser Thr Arg Asn Val Met Asn Trp
65 70 75 80

Thr Gln Ala Asp Leu Ala Tyr Asn Tyr Ala Lys Ser Lys Gly Phe Lys
85 90 95

Phe Lys Met His Thr Leu Val Trp Gly Ser Gln Glu Pro Ala Trp Val
100 105 110

Lys Asn Leu Asp Ala Thr Ser Gln Arg Val Glu Val Glu Gln Trp Met
115 120 125

Arg Leu Ser Cys Glu Arg Tyr Pro Asp Ser Trp Ala Ile Asp Val Val
130 135 140

Asn Glu Pro Leu His Ala Val Pro Ser Tyr Lys Asn Ala Leu Gly Gly
145 150 155 160

Asp Gly Ala Thr Gly Trp Asp Trp Val Ile Thr Ser Phe Arg Leu Ala
165 170 175

Arg Gln Tyr Cys Pro Arg Ala Lys Leu Leu Leu Asn Glu Tyr Ala Thr
180 185 190

Glu Leu Asp Ala Ser Lys Arg Ala Lys Ile Lys Thr Ile Ala Ser Leu
195 200 205

Leu Lys Ser Arg Gly Leu Ile Asp Gly Val Gly Leu Gln Ala His Phe
210 215 220

Phe Thr Leu Asp Tyr Met Asn Ala Ser Gln Met Lys Ala Ala Leu Asp
225 230 235 240

Asp Tyr Ala Thr Leu Gly Val Asp Ile Tyr Ile Ser Glu Leu Asp Leu
245 250 255

Lys Gly Ser Ala Asn Thr Asp Ala Ser Gln Lys Ala Lys Tyr Glu Glu
260 265 270

Leu Phe Pro Val Met Trp Asn His Ala Ser Val Lys Gly Ile Thr Leu
275 280 285

Trp Gly Tyr Lys Val Gly Glu Thr Trp Ser Ser Gly Thr Gly Leu Leu
290 295 300

Asn Ala Asn Gly Ser Glu Arg Pro Ala Leu Thr Trp Leu Lys Ser Tyr
305 310 315 320

Met Ser Ser Arg Pro Ala Ala Ser Ser Ser Ser Ser Ser Ser Val Ser
325 330 335

Ser Ser Lys Ser Ser Ser Ser Ser Ser Ser Gln Ser Ser Ala Ser Ser
340 345 350

Ser Ala Gly Ser Ala Pro Val Leu Ser Gly Thr Ser Asp Tyr Pro Ser
355 360 365

Gly Phe Ser Lys Cys Ala Asp Leu Gly Gly Thr Cys Ser Val Ser Ser
370 375 380

Gly Thr Gly Trp Ala Ala Phe Gly Arg Lys Gly Lys Trp Val Ala Lys
385 390 395 400

Tyr Val Gly Val Gly Lys Ser Ile Pro Cys Thr Val Ala Ala Phe Gly
405 410 415

Arg Asp Pro Gly Gly Asn Pro Asn Lys Cys Ser Phe Gln Arg
420 425 430

<210> 81 
<211> 1017 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample 
<400> 81 

ttgaccacga gagctattcg cacggaggca gcgctgaagg agatgtttgc ggaggacttt 60 

cagatcggag ccgctgttaa tccgatgact atacggacac aggaggagct gcttgcttat 120 

cacttcaaca gtattacggc agagaatgaa atgaagtttg ccagtctgca gccggaggag 180 

ggggcttatg cttttgacga ggcggatcga ttggcggcct tcgcccggaa gcatggcatg 240 

gcgatgcggg gacacacttt agtgtggcat aaccagtcca caggctggct gttcgaagac 300 

aagcagggaa atcctgtaga taaggcaact ctgctggaga ggctgaaatc gcacatccat 360 

acggtagtag gacgttataa aaacgatatt tatgcttggg atgtggtaaa cgaggttata 420 

gaggacgagg gagacggcct gctgcgccgg tcgaaatggc tggatattgc cggaccggaa 480 

ttcattgccc gggcgttcga gtatgctcat gaggctgacc ctaatgcgct gctcttctat 540 

aatgactaca acgagtccaa tccggcgaag cgagacaaga tccatgctct ggtgaagtcg 600 

ctgctggagc aaggcgtgcc tattcatggc attggactgc aggcgcattg gaatttgtat 660 

ggtccttctc tcggcgagat ccgagcggca ctggagaagt atgcttctct tggcctgcag 720 

ctgcagctta cggagctgga tatgtcgctg tttcgttttg acgacaagcg tacggatata 780 

accgagcctc cggcggaatt gcttgagctg caggctgagc ggtatgagga aattttcaag 840 

ctgctgaggg aataccggga tgtaatcact tccgtgacct tctggggggc tgcggatgat 900 

tatacgtggc tgaacgattt tcccgtccgg gggcggaaaa attggccttt cctgttcgat 960 

gagcagcatc accccaaact ggcatttcat cgggtcgctg cactttcccg ccagtga 1017 

<210> 82 
<211> 338 
<212> PRT 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
<400> 82 

Leu Thr Thr Arg Ala Ile Arg Thr Glu Ala Ala Leu Lys Glu Met Phe
1 5 10 15

Ala Glu Asp Phe Gln Ile Gly Ala Ala Val Asn Pro Met Thr Ile Arg
20 25 30

Thr Gln Glu Glu Leu Leu Ala Tyr His Phe Asn Ser Ile Thr Ala Glu
35 40 45

Asn Glu Met Lys Phe Ala Ser Leu Gln Pro Glu Glu Gly Ala Tyr Ala
50 55 60

Phe Asp Glu Ala Asp Arg Leu Ala Ala Phe Ala Arg Lys His Gly Met
65 70 75 80

Ala Met Arg Gly His Thr Leu Val Trp His Asn Gln Ser Thr Gly Trp
85 90 95

Leu Phe Glu Asp Lys Gln Gly Asn Pro Val Asp Lys Ala Thr Leu Leu
100 105 110

Glu Arg Leu Lys Ser His Ile His Thr Val Val Gly Arg Tyr Lys Asn
115 120 125

Asp Ile Tyr Ala Trp Asp Val Val Asn Glu Val Ile Glu Asp Glu Gly
130 135 140

Asp Gly Leu Leu Arg Arg Ser Lys Trp Leu Asp Ile Ala Gly Pro Glu
145 150 155 160

Phe Ile Ala Arg Ala Phe Glu Tyr Ala His Glu Ala Asp Pro Asn Ala
165 170 175

Leu Leu Phe Tyr Asn Asp Tyr Asn Glu Ser Asn Pro Ala Lys Arg Asp
180 185 190

Lys Ile His Ala Leu Val Lys Ser Leu Leu Glu Gln Gly Val Pro Ile
195 200 205

His Gly Ile Gly Leu Gln Ala His Trp Asn Leu Tyr Gly Pro Ser Leu
210 215 220

Gly Glu Ile Arg Ala Ala Leu Glu Lys Tyr Ala Ser Leu Gly Leu Gln
225 230 235 240

Leu Gln Leu Thr Glu Leu Asp Met Ser Leu Phe Arg Phe Asp Asp Lys
245 250 255

Arg Thr Asp Ile Thr Glu Pro Pro Ala Glu Leu Leu Glu Leu Gln Ala
260 265 270

Glu Arg Tyr Glu Glu Ile Phe Lys Leu Leu Arg Glu Tyr Arg Asp Val
275 280 285

Ile Thr Ser Val Thr Phe Trp Gly Ala Ala Asp Asp Tyr Thr Trp Leu
290 295 300

Asn Asp Phe Pro Val Arg Gly Arg Lys Asn Trp Pro Phe Leu Phe Asp
305 310 315 320

Glu Gln His His Pro Lys Leu Ala Phe His Arg Val Ala Ala Leu Ser
325 330 335

Arg Gln



<210> 83 
<211> 3024 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 83 

atgaaaacca aaaggtctat attcaggttg tctatcctgg ttgtcctggc tgtgctgctg 60 

ttcagcgcaa tcaccttgac agccagcgcc gccgacacgc tcggcgcggc ggcggcccag 120 

tcgggccggt acttcggcac ggcgatagct gccggcaagc tcggcgactc gacctacacg 180 

accattgcca accgtgagtt caacatgatc acggctgaga atgagatgaa gatcgacgcc 240 

accgagccga accagaacca attcaacttc accaacgccg accggatctt caactgggcg 300 

gtgcagaatg ggaagcaggt gcgcgggcac acgctggcat ggcactcgca gcagccgggg 360 

tggatgagca gcatgagcgg caccgcgctg cgcaatgcga tgatcaacca catcaatggg 420 

gtgatggccc actacaaggg caggatctac gcctgggatg tggtgaacga ggctttcaac 480 

gaggacggca gccgccgcaa ctcgaacctg cagcagaccg gcaacgactg gatcgaggtg 540 

gccttccgga cagcccgcac cgccgacccg gccgccaagc tgtgctacaa cgactacaac 600 

atcgaagcct ggagctatgc caagacgcag ggcgtttacc ggatggtcca ggacttcaag 660 

tcccgcggcg tgccgatcga ctgtgtcggg ttccagagcc acttcaacag cggcacttcc 720 

tacgtcaaca gcaacttccg gacgacgctg caaagcttcg ccgcgctggg cgtggacgtg 780 

cagatcaccg agctggatgt cgagaatgcc gactcgcggc tcgattggtg gagaggcatc 840 

gtcaatgact gcctggcggt cccgcgctgc aacggcatca cggtgtgggg cgtgcgcgac 900 

agcgattcgt ggcgctcttc gcagaacccg ctgctgttca actccagcgg tggtaagaag 960 

gcttcgtaca ccgccgtcct cgacgccctc aacgctgccc cgaccgtcac acctccggta 1020 

acgacacctc cggtgacgac accgccagtg accacgcctc ctcccggcac tgtgtcgatt 1080 

aacgcgggcg gctcggcgag cggcagcttc acggccgacc agtacttcag cggtggcagc 1140 

acctacacca acaccgccac catcgacatg agtcagatca ccagcaaccc accgccggcg 1200 

gcggtcttca acagcgagcg ttacggggcg atgacctaca ccatccccaa ccgctcgggt 1260 

gctcagacgg tgacgctgta ctttgccgag acctacctca ccgcggcagg gcagcggtcg 1320 

ttcaacgtgt cgattaatgg cgcagcggcg ctgtccaact tcgacatcta tgcctcggca 1380 

ggtggcgcta accgggccat cgcccggacg ttcagcacca cggctaactc aagtggccag 1440 

gtggtgatcc agttcacggc ggttaccgag aaccccaaga tcaacgctat cacggtaaca 1500 

gcgggtggca cgcctccacc gacaacgcct ccgcccacca cgccgccacc gaccacccct 1560 

ccggtgacga cacccccagt gacgacaccc ccagtgacga caccgccccc cggcagcgtg 1620 

tcgatcaacg cgggcggctc ggccaccggc agcttcacgg gcgaccagta ctttagcggt 1680 

ggcagcacct acaccaacac cgccaccatc gacatgagcc agatcaccag caacccacca 1740 

ccggcggcgg tgttcaacag cgagcgctac ggggcgatga cctacaccat ccccggccgc 1800 
tcgggggctc agacggtcac gctttacttt gccgaaacgt atgtcactgc ggcagggcag 1860 

cgcgtcttta acgtgtctgt aaacggcgcg gcagcgctgt ccaacttcga catctatgcc 1920 

agcgccggcg gccagaaccg ggccatcgct cgctccttca acaccacggc caactcaagc 1980 

ggccaggtgg tgatccagtt cacggcggtc accgagaacc ccaagatcaa cgccatcact 2040 

gtggcgggcg ggatcgggga cttccaaacc ctgaccgtca cgaagtccgg cacggggacg 2100 

gtcacctcca acccggctgg tatcaactgc ggctcgacct gcaacgccag cttcgctacc 2160 

ggcaccagcg tgaccctgac cgcctccggc gggaccttca ccggctggag cggagcctgc 2220 

tccggcacct ccaccacctg caccgtctcc atgacccagg cccggtcggt caccgctact 2280 

tttagcggcg gtggtgacac caggccgagc gcggggtgtg gtaagaaccg gacactgcag 2340 

aatggcacaa tcaccatttc aagtggcggc gtcaaccgca cctacatcct acgcacgcct 2400 

gacaactaca acaacacgca tgcataccgg ctgatcatgg cttatcactg gcttaacggc 2460 

agcgcgcaga atgtggcgag cgagaactac taccggctgt tcccactctc caacaacagc 2520 

accatcttcg tggcgcctca ggggctggat gccggatggg ctaacaccaa caaccgcgac 2580 

ctgaacctca ccgatgccat actcacccag gtcgagaacg atctgtgcgt cgacttgaac 2640 

cgggtctggg ccaccgggtt cagctacggc gcaggtatgt catacgccat cgcctgtgcc 2700 

agggccaatg tgttccgggg cgtcgctctc tatgccggcg cgcagctcag cggttgcacc 2760 

ggtggaacca cggccattgc gtacttcgca acgcacggca tcaacgacag tgtcctcaac 2820 

atctcgcaag ggcggactct acgcgaccgc tttgtctcga acaacagctg cacggcgcag 2880 

aaccctcccg agccttcctc gggcagcggg acgcacatct gcacgtccta ccagaactgc 2940 

tcggcaggac atcctgtccg gtggtgcgcg ttcgacggcg accacacccc gaatcagacc 3000 

gaccgcggcc agagcacaag ctaa 3024 

<210> 84 
<211> 1007 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(30) 

<400> 84 

Met Lys Thr Lys Arg Ser Ile Phe Arg Leu Ser Ile Leu Val Val Leu 
1 5 10 15 

Ala Val Leu Leu Phe Ser Ala Ile Thr Leu Thr Ala Ser Ala Ala Asp 
20 25 30 

Thr Leu Gly Ala Ala Ala Ala Gln Ser Gly Arg Tyr Phe Gly Thr Ala 
35 40 45 

Ile Ala Ala Gly Lys Leu Gly Asp Ser Thr Tyr Thr Thr Ile Ala Asn 
50 55 60 

Arg Glu Phe Asn Met Ile Thr Ala Glu Asn Glu Met Lys Ile Asp Ala 
65 70 75 80 

Thr Glu Pro Asn Gln Asn Gln Phe Asn Phe Thr Asn Ala Asp Arg Ile 
85 90 95 

Phe Asn Trp Ala Val Gln Asn Gly Lys Gln Val Arg Gly His Thr Leu 
100 105 110 

Ala Trp His Ser Gln Gln Pro Gly Trp Met Ser Ser Met Ser Gly Thr 
115 120 125 

Ala Leu Arg Asn Ala Met Ile Asn His Ile Asn Gly Val Met Ala His 
130 135 140 

Tyr Lys Gly Arg Ile Tyr Ala Trp Asp Val Val Asn Glu Ala Phe Asn 
145 150 155 160 

Glu Asp Gly Ser Arg Arg Asn Ser Asn Leu Gln Gln Thr Gly Asn Asp 
165 170 175 

Trp Ile Glu Val Ala Phe Arg Thr Ala Arg Thr Ala Asp Pro Ala Ala 
180 185 190 

Lys Leu Cys Tyr Asn Asp Tyr Asn Ile Glu Ala Trp Ser Tyr Ala Lys 
195 200 205 

Thr Gln Gly Val Tyr Arg Met Val Gln Asp Phe Lys Ser Arg Gly Val 
210 215 220 

Pro Ile Asp Cys Val Gly Phe Gln Ser His Phe Asn Ser Gly Thr Ser 
225 230 235 240 

Tyr Val Asn Ser Asn Phe Arg Thr Thr Leu Gln Ser Phe Ala Ala Leu 
245 250 255 

Gly Val Asp Val Gln Ile Thr Glu Leu Asp Val Glu Asn Ala Asp Ser 
260 265 270 

Arg Leu Asp Trp Trp Arg Gly Ile Val Asn Asp Cys Leu Ala Val Pro 
275 280 285 

Arg Cys Asn Gly Ile Thr Val Trp Gly Val Arg Asp Ser Asp Ser Trp 
290 295 300 

Arg Ser Ser Gln Asn Pro Leu Leu Phe Asn Ser Ser Gly Gly Lys Lys 
305 310 315 320 

Ala Ser Tyr Thr Ala Val Leu Asp Ala Leu Asn Ala Ala Pro Thr Val 
325 330 335 

Thr Pro Pro Val Thr Thr Pro Pro Val Thr Thr Pro Pro Val Thr Thr 
340 345 350 

Pro Pro Pro Gly Thr Val Ser Ile Asn Ala Gly Gly Ser Ala Ser Gly 
355 360 365 

Ser Phe Thr Ala Asp Gln Tyr Phe Ser Gly Gly Ser Thr Tyr Thr Asn 
370 375 380 

Thr Ala Thr Ile Asp Met Ser Gln Ile Thr Ser Asn Pro Pro Pro Ala 
385 390 395 400 

Ala Val Phe Asn Ser Glu Arg Tyr Gly Ala Met Thr Tyr Thr Ile Pro 
405 410 415 

Asn Arg Ser Gly Ala Gln Thr Val Thr Leu Tyr Phe Ala Glu Thr Tyr 
420 425 430 

Leu Thr Ala Ala Gly Gln Arg Ser Phe Asn Val Ser Ile Asn Gly Ala 
435 440 445 

Ala Ala Leu Ser Asn Phe Asp Ile Tyr Ala Ser Ala Gly Gly Ala Asn 
450 455 460 

Arg Ala Ile Ala Arg Thr Phe Ser Thr Thr Ala Asn Ser Ser Gly Gln 
465 470 475 480 

Val Val Ile Gln Phe Thr Ala Val Thr Glu Asn Pro Lys Ile Asn Ala 
485 490 495 

Ile Thr Val Thr Ala Gly Gly Thr Pro Pro Pro Thr Thr Pro Pro Pro 
500 505 510 

Thr Thr Pro Pro Pro Thr Thr Pro Pro Val Thr Thr Pro Pro Val Thr 
515 520 525 

Thr Pro Pro Val Thr Thr Pro Pro Pro Gly Ser Val Ser Ile Asn Ala 
530 535 540 

Gly Gly Ser Ala Thr Gly Ser Phe Thr Gly Asp Gln Tyr Phe Ser Gly 
545 550 555 560 

Gly Ser Thr Tyr Thr Asn Thr Ala Thr Ile Asp Met Ser Gln Ile Thr 
565 570 575 

Ser Asn Pro Pro Pro Ala Ala Val Phe Asn Ser Glu Arg Tyr Gly Ala 
580 585 590 

Met Thr Tyr Thr Ile Pro Gly Arg Ser Gly Ala Gln Thr Val Thr Leu 
595 600 605 

Tyr Phe Ala Glu Thr Tyr Val Thr Ala Ala Gly Gln Arg Val Phe Asn 
610 615 620 

Val Ser Val Asn Gly Ala Ala Ala Leu Ser Asn Phe Asp Ile Tyr Ala 
625 630 635 640 

Ser Ala Gly Gly Gln Asn Arg Ala Ile Ala Arg Ser Phe Asn Thr Thr 
645 650 655 

Ala Asn Ser Ser Gly Gln Val Val Ile Gln Phe Thr Ala Val Thr Glu 
660 665 670 

Asn Pro Lys Ile Asn Ala Ile Thr Val Ala Gly Gly Ile Gly Asp Phe 
675 680 685 

Gln Thr Leu Thr Val Thr Lys Ser Gly Thr Gly Thr Val Thr Ser Asn 
690 695 700 

Pro Ala Gly Ile Asn Cys Gly Ser Thr Cys Asn Ala Ser Phe Ala Thr 
705 710 715 720 

Gly Thr Ser Val Thr Leu Thr Ala Ser Gly Gly Thr Phe Thr Gly Trp 
725 730 735 

Ser Gly Ala Cys Ser Gly Thr Ser Thr Thr Cys Thr Val Ser Met Thr 
740 745 750 

Gln Ala Arg Ser Val Thr Ala Thr Phe Ser Gly Gly Gly Asp Thr Arg 
755 760 765 

Pro Ser Ala Gly Cys Gly Lys Asn Arg Thr Leu Gln Asn Gly Thr Ile 
770 775 780 

Thr Ile Ser Ser Gly Gly Val Asn Arg Thr Tyr Ile Leu Arg Thr Pro 
785 790 795 800 

Asp Asn Tyr Asn Asn Thr His Ala Tyr Arg Leu Ile Met Ala Tyr His 
805 810 815 

Trp Leu Asn Gly Ser Ala Gln Asn Val Ala Ser Glu Asn Tyr Tyr Arg 
820 825 830 
Leu Phe Pro Leu Ser Asn Asn Ser Thr Ile Phe Val Ala Pro Gln Gly 
835 840 845 

Leu Asp Ala Gly Trp Ala Asn Thr Asn Asn Arg Asp Leu Asn Leu Thr 
850 855 860 

Asp Ala Ile Leu Thr Gln Val Glu Asn Asp Leu Cys Val Asp Leu Asn 
865 870 875 880 

Arg Val Trp Ala Thr Gly Phe Ser Tyr Gly Ala Gly Met Ser Tyr Ala 
885 890 895 

Ile Ala Cys Ala Arg Ala Asn Val Phe Arg Gly Val Ala Leu Tyr Ala 
900 905 910 

Gly Ala Gln Leu Ser Gly Cys Thr Gly Gly Thr Thr Ala Ile Ala Tyr 
915 920 925 

Phe Ala Thr His Gly Ile Asn Asp Ser Val Leu Asn Ile Ser Gln Gly 
930 935 940 

Arg Thr Leu Arg Asp Arg Phe Val Ser Asn Asn Ser Cys Thr Ala Gln 
945 950 955 960 

Asn Pro Pro Glu Pro Ser Ser Gly Ser Gly Thr His Ile Cys Thr Ser 
965 970 975 

Tyr Gln Asn Cys Ser Ala Gly His Pro Val Arg Trp Cys Ala Phe Asp 
980 985 990 

Gly Asp His Thr Pro Asn Gln Thr Asp Arg Gly Gln Ser Thr Ser 
995 1000 1005 

<210> 85 
<211> 1254 
<212> DNA 
<213> Bacteria 

<400> 85 

atgaccttga ttacgccaag ctcgaaatta accctcacta aagggaacaa aagctggagc 60 

tcgcgcgcct gcaggtcgac actagtggat ctcacacttt acttcgagtc tcagaatccg 120 

accctcgagt tctacgtgga cgatgtgaag gtagtggaca ccacctctgc tgagataaaa 180 

ctcgagatga atccagaaga ggaaatacca gccctcaggg aagttctgaa agactacttc 240 

agagtgggcg ttgctcttcc atccaaggta ttcatcaacc agaaggactt aacgctcatc 300 

accaagcact tcaacagcat caccgcagaa aatgagatga aacctgatag tctgcttgca 360 

ggcattgaga atggcaaact caagttcaga tttgaaacag cagacaaata catcgaattt 420 

gcacagcaaa acggcatggt tgtgaggggc cacacactgg tatggcacaa tcagacgccc 480 

gagtggttct tcaaagacga aaatggaaac ctcctctcca aagaagcgat gacagaaaga 540 

ctcagagaat acatacacac cgtcgttgga cacttcaaag ggaaggtcta cgcatgggac 600 

gttgtgaacg aagcggtcga tccgaaccag ccagatggac tgagaagatc cacctggtat 660 

cagatcatgg ggcctgacta catagaactt gccttcaagt ttgcaaggga ggcagatccc 720 

gatgcgaaac tcttctacaa cgactacaac accttcgaac ccaaaaagag agacatcatc 780 

tacaaccttg tgaagagtct caaggaaaag ggtctcatcg atggaatcgg tatgcagtgt 840 

cacatcagtc ttgcaacgga catcaggcag atcgaagagg ccatcaaaaa gttcagctcc 900 

atccctggta tagaaatcca cataacagag ctcgatatga gcgtctacag agattctact 960 

tccaactacc cagaggcacc gaggaacgca ctcattgaac aggctcacaa gatggctcaa 1020 

ctctttgaaa tcttcaagaa atacagtaat gtgatcacaa acgtcacgtt ctggggtctc 1080 

aaagacgact actcctggag agcaacaaga agaaatgact ggacattgat ctttgacaaa 1140 

gattatcagg caaaactcgc ttactgggcg attgtcgctc ctgaagtgct accacctctt 1200 

tcaaaagaaa gcaagatcca aagaattcaa aaagcttctc gagagtactt ctag 1254 



<210> 86 

<211> 417 

<212> PRT 

<213> Bacteria 

<400> 86 

Met Thr Leu Ile Thr Pro Ser Ser Lys Leu Thr Leu Thr Lys Gly Asn 
1 5 10 15 

Lys Ser Trp Ser Ser Arg Ala Cys Arg Ser Thr Leu Val Asp Leu Thr 
20 25 30 

Leu Tyr Phe Glu Ser Gln Asn Pro Thr Leu Glu Phe Tyr Val Asp Asp 
35 40 45 

Val Lys Val Val Asp Thr Thr Ser Ala Glu Ile Lys Leu Glu Met Asn 
50 55 60 

Pro Glu Glu Glu Ile Pro Ala Leu Arg Glu Val Leu Lys Asp Tyr Phe 
65 70 75 80 

Arg Val Gly Val Ala Leu Pro Ser Lys Val Phe Ile Asn Gln Lys Asp 
85 90 95 

Leu Thr Leu Ile Thr Lys His Phe Asn Ser Ile Thr Ala Glu Asn Glu 
100 105 110 

Met Lys Pro Asp Ser Leu Leu Ala Gly Ile Glu Asn Gly Lys Leu Lys 
115 120 125 

Phe Arg Phe Glu Thr Ala Asp Lys Tyr Ile Glu Phe Ala Gln Gln Asn 
130 135 140 

Gly Met Val Val Arg Gly His Thr Leu Val Trp His Asn Gln Thr Pro 
145 150 155 160 

Glu Trp Phe Phe Lys Asp Glu Asn Gly Asn Leu Leu Ser Lys Glu Ala 
165 170 175 

Met Thr Glu Arg Leu Arg Glu Tyr Ile His Thr Val Val Gly His Phe 
180 185 190 

Lys Gly Lys Val Tyr Ala Trp Asp Val Val Asn Glu Ala Val Asp Pro 
195 200 205 

Asn Gln Pro Asp Gly Leu Arg Arg Ser Thr Trp Tyr Gln Ile Met Gly 
210 215 220 

Pro Asp Tyr Ile Glu Leu Ala Phe Lys Phe Ala Arg Glu Ala Asp Pro 
225 230 235 240 

Asp Ala Lys Leu Phe Tyr Asn Asp Tyr Asn Thr Phe Glu Pro Lys Lys 
245 250 255 

Arg Asp Ile Ile Tyr Asn Leu Val Lys Ser Leu Lys Glu Lys Gly Leu 
260 265 270 

Ile Asp Gly Ile Gly Met Gln Cys His Ile Ser Leu Ala Thr Asp Ile 
275 280 285 

Arg Gln Ile Glu Glu Ala Ile Lys Lys Phe Ser Ser Ile Pro Gly Ile 
290 295 300 

Glu Ile His Ile Thr Glu Leu Asp Met Ser Val Tyr Arg Asp Ser Thr 
305 310 315 320 

Ser Asn Tyr Pro Glu Ala Pro Arg Asn Ala Leu Ile Glu Gln Ala His 
325 330 335 

Lys Met Ala Gln Leu Phe Glu Ile Phe Lys Lys Tyr Ser Asn Val Ile 
340 345 350 

Thr Asn Val Thr Phe Trp Gly Leu Lys Asp Asp Tyr Ser Trp Arg Ala 
355 360 365 

Thr Arg Arg Asn Asp Trp Thr Leu Ile Phe Asp Lys Asp Tyr Gln Ala 
370 375 380 

Lys Leu Ala Tyr Trp Ala Ile Val Ala Pro Glu Val Leu Pro Pro Leu 
385 390 395 400 

Ser Lys Glu Ser Lys Ile Gln Arg Ile Gln Lys Ala Ser Arg Glu Tyr 
405 410 415 

Phe 

<210> 87 
<211> 1089 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 87 

ttgaagaaca gaattaaaaa ggttgtgggc gggctcgccc tggcgagtgt tctgctcacc 60 

tcggtaatgg caggcaatgc cagcgcagca attaccaatg gatcgaagtt cctggggaat 120 

atcattgccg gcagtgctcc aagtaacttc accacctact ggaatcaggt caccccggag 180 

aacggcacca aatggggttc catcgaaggc aaccgcaacc agatgaactg gggaaacgcg 240 

gacatgatct ataactacgc catcagcaaa aacatcccgt tcaaattcca tactctcgtc 300 

tggggaagcc aggagcccaa ctgggtggcc ggcttgtcgg cagcggagca gaaggcggaa 360 

atcagctcat tcattactca agcaggacag cgttattccg cgaagacagc ttttgtggat 420 

gtagtcaatg aaccgctgca tgccaagcct tcgtaccgca atgccatcgg cggcgatggc 480 

agcaccggct gggattgggt gatctggtct ttccagcaag cccgggccgc cttcccgaac 540 

gccaagctgc acctcaatga ctacggcatt atcggtgacc ccagcgcggc cgataaatat 600 

gtgaacatta tcaatatcct gaaatccaga ggactgatcg atggtattgg tattcagtgc 660 

cactacttca atatggataa cgtaagtgtg agcaccatga atactgtact gggtaagctt 720 

gctgcaacag gcctgccaat ctatgtctcc gagctggata ttaccggtga tgacaacacc 780 

cagcttgcca gataccaaca gaaattccct gtgctctgga accatccttc cgtgaagggc 840 

gtcaccctgt ggggctacat ccaaaatcag acctgggcat caggcaccca tctggtgaat 900 

tccaacggca cagagcgccc tgccctgaag tggctgaagc aatacctggg cggctcgtca 960 

gctctgatgg aaaccacaga cgcccaagac ctcactatca ctgacagtct gatccagccg 1020 
gacagtgtgg ttgagccgga ccctcaactg gatctccagc cggtgcttga gcccgttccg 1080 

gctgagtaa 1089 

<210> 88 
<211> 362 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(29) 

<400> 88 

Leu Lys Asn Arg Ile Lys Lys Val Val Gly Gly Leu Ala Leu Ala Ser 
1 5 10 15 

Val Leu Leu Thr Ser Val Met Ala Gly Asn Ala Ser Ala Ala Ile Thr 
20 25 30 

Asn Gly Ser Lys Phe Leu Gly Asn Ile Ile Ala Gly Ser Ala Pro Ser 
35 40 45 

Asn Phe Thr Thr Tyr Trp Asn Gln Val Thr Pro Glu Asn Gly Thr Lys 
50 55 60 

Trp Gly Ser Ile Glu Gly Asn Arg Asn Gln Met Asn Trp Gly Asn Ala 
65 70 75 80 

Asp Met Ile Tyr Asn Tyr Ala Ile Ser Lys Asn Ile Pro Phe Lys Phe 
85 90 95 

His Thr Leu Val Trp Gly Ser Gln Glu Pro Asn Trp Val Ala Gly Leu 
100 105 110 

Ser Ala Ala Glu Gln Lys Ala Glu Ile Ser Ser Phe Ile Thr Gln Ala 
115 120 125 

Gly Gln Arg Tyr Ser Ala Lys Thr Ala Phe Val Asp Val Val Asn Glu 
130 135 140 

Pro Leu His Ala Lys Pro Ser Tyr Arg Asn Ala Ile Gly Gly Asp Gly 
145 150 155 160 

Ser Thr Gly Trp Asp Trp Val Ile Trp Ser Phe Gln Gln Ala Arg Ala 
165 170 175 

Ala Phe Pro Asn Ala Lys Leu His Leu Asn Asp Tyr Gly Ile Ile Gly 
180 185 190 

Asp Pro Ser Ala Ala Asp Lys Tyr Val Asn Ile Ile Asn Ile Leu Lys 
195 200 205 

Ser Arg Gly Leu Ile Asp Gly Ile Gly Ile Gln Cys His Tyr Phe Asn 
210 215 220 

Met Asp Asn Val Ser Val Ser Thr Met Asn Thr Val Leu Gly Lys Leu 
225 230 235 240 

Ala Ala Thr Gly Leu Pro Ile Tyr Val Ser Glu Leu Asp Ile Thr Gly 
245 250 255 

Asp Asp Asn Thr Gln Leu Ala Arg Tyr Gln Gln Lys Phe Pro Val Leu 
260 265 270 

Trp Asn His Pro Ser Val Lys Gly Val Thr Leu Trp Gly Tyr Ile Gln 
275 280 285 

Asn Gln Thr Trp Ala Ser Gly Thr His Leu Val Asn Ser Asn Gly Thr 
290 295 300 

Glu Arg Pro Ala Leu Lys Trp Leu Lys Gln Tyr Leu Gly Gly Ser Ser 
305 310 315 320 

Ala Leu Met Glu Thr Thr Asp Ala Gln Asp Leu Thr Ile Thr Asp Ser 
325 330 335 

Leu Ile Gln Pro Asp Ser Val Val Glu Pro Asp Pro Gln Leu Asp Leu 
340 345 350 

Gln Pro Val Leu Glu Pro Val Pro Ala Glu 
355 360 

<210> 89 
<211> 2541 
<212> DNA 
<213> Bacteria 

<400> 89 

atggatacat tgttcaatac aaccgatgag cgtggggcgt ccaagcgccg tggcatcgtc 60 
gcggcgcttg cggccgcagc catgctggtg ccgctggcgt tcgccccgac ggccatggcg 120 

gccgaccccg actatccggg cggcatcaag ggcgaataca atccgctggg aatcaacgct 180 

ggtgtcgcca tcgagacata caccctcaac caggacaagg agaaggccct ggtcgagaac 240 

ttcgaccaga tcaccccgga gaactcgctg aagccggaag gctggtacga cgaccagcat 300 

aatttccgca tgtcggatga cgcgcggaac ctgctgacgt tcgccagcga gaacggcatc 360 

aaggtctacg gccatgttct ggtctggcac tcgcagacgc ccgactggtt cttccaggcc 420 

gacgaatggt gccatgacac caacgacaac cccggcgtca ccagctgccc gcttgccgac 480 

aaggccacga tgcaggaacg ccagcgcagg cacatcgaga acgtggcgga ggccatctcc 540 

gacgaattcg gaaaattcgg cagcccgacc aatcccgtcg tcgcgttcga cgtggtcaac 600 

gagaccgtga acgacagcga cgaccccgcc accaacggca tgcgcaattc gctgtggtat 660 

cagacctatg ggggcgagga ctacatctat gacgcgttcc ggaacgcgaa tacgtatctg 720 

aacgacgtct acgccgccga cgacgcggag catccggtga cgttgttcat caacgattac 780 

ggcaccgagc aggcgggcaa gcgttcccgc tacaaggcgc ttctggaacg catgatccag 840 

cagggggttc cctttgacgg catcggtcac cagttccatg tgtcgttgac cacggcctcg 900 

tcgaatctcg acgacgcgct gaccgatatg tcctcgctcg gcaagaagca ggccatcacc 960 

gaactggacg tcgccaccgg aacgccggtt acggaggcga agctcatcga gcagggacgg 1020 

tactactacg acgtcaacca gatcatccac aggcacgccg accagctgtt ctcggtttcg 1080 

gtgtgggggc tgagcgacga ccagtcctgg cgcaacaagg agggcgcgcc gctgctgttc 1140 

gacgacaacc tggagaagaa gccggcgtac atcggctaca tcggtgatag cgccaacctt 1200 

cccgagccgt tgaagagcat gaacgcattc aaggatgacg ccgtgggcat cgactcggcg 1260 

cttcccggta ccgtggccga gtccggcgcg tcctctccgt gggaacgtct ttcgctggtc 1320 

gagatgaccc cgtctgcgta tgacgccgtt tccggctcgt tcaatgtcta ttggaaggac 1380 

ggctctctgg tcgtctacgc ggatgtcgcc gatgccagcg cggcggatga cgacaccgtc 1440 

accgtgcgtg tgggtgacgc cgagtatacg atcggccgca acggtgtgac cggcggcgag 1500 

ggtgtgcagg ccaacgtcgt ttcgtctgat gccggatacg aagtcgtggc cgatatcccg 1560 

tacaccggtg cagagaagga catcgtcgag atgaacgtca tcgcgacgga ttccgccacc 1620 

acggagacca gcgcgtggag cacgaacgac actggcgccg tcacgctggc cgagccgctg 1680 

agctacacgg aagccgtgaa ggttcccgcc gacgcccagg ctccggtcgt tgacgccgac 1740 

ccgtcggatt ccgtctgggc ggaagccaac gaggttcccg tgggtaaggt gaccgccgcc 1800 

acgccttccc ccgaggcgac cgctaccgcc aagaccctgt ggtcggacgg caagctgtat 1860 

gtcctcatgg aagtgaccga cgcggacatc gatctgacca actcgaatcc gtgggagaag 1920 

gactccgttg aggtgtacat cgaccgtggc aacaccaaga gcggccagta taccaacgac 1980 

atccagcaga ttcgcgtgtc cgccgatggt gcggagctga gcttcggctc cggcgcgtcg 2040 

gaggatgtcc agaagtccat ggtccagacc gccggcaagc tcgtcgatgg cggctatgtc 2100 

gtcgagatgg ccatcgatct gggaacggct gaggccggca ccttcgaagg tgtcgacttc 2160 

cagatcaacg acgcgaagaa cggtgctcga atcggcatcc gcaactgggc cgatccgacc 2220 

ggtgccggct atcagacggc gtcccattgg ggcgtgctgc gtctgctggc cgatccctcc 2280 

gaaaccgaga ccccgggtgg agaagatccc gagacccccg gtgacgagga gactcctggc 2340 

gaggataccg agaagcctgg cgacgaggaa acccccggtg aggataccga gaagcctggc 2400 

gacgagaagc cgcggccttc cgacgatgct gacaacgacg acaagatgcc gcagaccggt 2460 

tccgcggtca tcggaatcgc cgtggtggcg ctgctgctgg ttgccgccgg atgcgggctg 2520 

gtcatcgctc ggcgtcgatg a 2541 

<210> 90 

<211> 846 

<212> PRT 

<213> Bacteria 

<220> 

<221> SIGNAL 
<222> (1)...(40) 

<400> 90 

Met Asp Thr Leu Phe Asn Thr Thr Asp Glu Arg Gly Ala Ser Lys Arg 
1 5 10 15 

Arg Gly Ile Val Ala Ala Leu Ala Ala Ala Ala Met Leu Val Pro Leu 
20 25 30 

Ala Phe Ala Pro Thr Ala Met Ala Ala Asp Pro Asp Tyr Pro Gly Gly 
35 40 45 

Ile Lys Gly Glu Tyr Asn Pro Leu Gly Ile Asn Ala Gly Val Ala Ile 
50 55 60 

Glu Thr Tyr Thr Leu Asn Gln Asp Lys Glu Lys Ala Leu Val Glu Asn 
65 70 75 80 

Phe Asp Gln Ile Thr Pro Glu Asn Ser Leu Lys Pro Glu Gly Trp Tyr 
85 90 95 

Asp Asp Gln His Asn Phe Arg Met Ser Asp Asp Ala Arg Asn Leu Leu 
100 105 110 

Thr Phe Ala Ser Glu Asn Gly Ile Lys Val Tyr Gly His Val Leu Val 
115 120 125 

Trp His Ser Gln Thr Pro Asp Trp Phe Phe Gln Ala Asp Glu Trp Cys 
130 135 140 

His Asp Thr Asn Asp Asn Pro Gly Val Thr Ser Cys Pro Leu Ala Asp 
145 150 155 160 

Lys Ala Thr Met Gln Glu Arg Gln Arg Arg His Ile Glu Asn Val Ala 
165 170 175 

Glu Ala Ile Ser Asp Glu Phe Gly Lys Phe Gly Ser Pro Thr Asn Pro 
180 185 190 

Val Val Ala Phe Asp Val Val Asn Glu Thr Val Asn Asp Ser Asp Asp 
195 200 205 

Pro Ala Thr Asn Gly Met Arg Asn Ser Leu Trp Tyr Gln Thr Tyr Gly 
210 215 220 

Gly Glu Asp Tyr Ile Tyr Asp Ala Phe Arg Asn Ala Asn Thr Tyr Leu 
225 230 235 240 

Asn Asp Val Tyr Ala Ala Asp Asp Ala Glu His Pro Val Thr Leu Phe 
245 250 255 

Ile Asn Asp Tyr Gly Thr Glu Gln Ala Gly Lys Arg Ser Arg Tyr Lys 
260 265 270 

Ala Leu Leu Glu Arg Met Ile Gln Gln Gly Val Pro Phe Asp Gly Ile 
275 280 285 

Gly His Gln Phe His Val Ser Leu Thr Thr Ala Ser Ser Asn Leu Asp 
290 295 300 

Asp Ala Leu Thr Asp Met Ser Ser Leu Gly Lys Lys Gln Ala Ile Thr 
305 310 315 320 

Glu Leu Asp Val Ala Thr Gly Thr Pro Val Thr Glu Ala Lys Leu Ile 
325 330 335 

Glu Gln Gly Arg Tyr Tyr Tyr Asp Val Asn Gln Ile Ile His Arg His 
340 345 350 

Ala Asp Gln Leu Phe Ser Val Ser Val Trp Gly Leu Ser Asp Asp Gln 
355 360 365 

Ser Trp Arg Asn Lys Glu Gly Ala Pro Leu Leu Phe Asp Asp Asn Leu 
370 375 380 

Glu Lys Lys Pro Ala Tyr Ile Gly Tyr Ile Gly Asp Ser Ala Asn Leu 
385 390 395 400 

Pro Glu Pro Leu Lys Ser Met Asn Ala Phe Lys Asp Asp Ala Val Gly 
405 410 415 

Ile Asp Ser Ala Leu Pro Gly Thr Val Ala Glu Ser Gly Ala Ser Ser 
420 425 430 

Pro Trp Glu Arg Leu Ser Leu Val Glu Met Thr Pro Ser Ala Tyr Asp 
435 440 445 

Ala Val Ser Gly Ser Phe Asn Val Tyr Trp Lys Asp Gly Ser Leu Val 
450 455 460 

Val Tyr Ala Asp Val Ala Asp Ala Ser Ala Ala Asp Asp Asp Thr Val 
465 470 475 480 

Thr Val Arg Val Gly Asp Ala Glu Tyr Thr Ile Gly Arg Asn Gly Val 
485 490 495 

Thr Gly Gly Glu Gly Val Gln Ala Asn Val Val Ser Ser Asp Ala Gly 
500 505 510 

Tyr Glu Val Val Ala Asp Ile Pro Tyr Thr Gly Ala Glu Lys Asp Ile 
515 520 525 

Val Glu Met Asn Val Ile Ala Thr Asp Ser Ala Thr Thr Glu Thr Ser 
530 535 540 

Ala Trp Ser Thr Asn Asp Thr Gly Ala Val Thr Leu Ala Glu Pro Leu 
545 550 555 560 

Ser Tyr Thr Glu Ala Val Lys Val Pro Ala Asp Ala Gln Ala Pro Val 
565 570 575 

Val Asp Ala Asp Pro Ser Asp Ser Val Trp Ala Glu Ala Asn Glu Val 
580 585 590 

Pro Val Gly Lys Val Thr Ala Ala Thr Pro Ser Pro Glu Ala Thr Ala 
595 600 605 

Thr Ala Lys Thr Leu Trp Ser Asp Gly Lys Leu Tyr Val Leu Met Glu 
610 615 620 

Val Thr Asp Ala Asp Ile Asp Leu Thr Asn Ser Asn Pro Trp Glu Lys 
625 630 635 640 

Asp Ser Val Glu Val Tyr Ile Asp Arg Gly Asn Thr Lys Ser Gly Gln 
645 650 655 

Tyr Thr Asn Asp Ile Gln Gln Ile Arg Val Ser Ala Asp Gly Ala Glu 
660 665 670 

Leu Ser Phe Gly Ser Gly Ala Ser Glu Asp Val Gln Lys Ser Met Val 
675 680 685 

Gln Thr Ala Gly Lys Leu Val Asp Gly Gly Tyr Val Val Glu Met Ala 
690 695 700 

Ile Asp Leu Gly Thr Ala Glu Ala Gly Thr Phe Glu Gly Val Asp Phe 
705 710 715 720 

Gln Ile Asn Asp Ala Lys Asn Gly Ala Arg Ile Gly Ile Arg Asn Trp 
725 730 735 

Ala Asp Pro Thr Gly Ala Gly Tyr Gln Thr Ala Ser His Trp Gly Val 
740 745 750 

Leu Arg Leu Leu Ala Asp Pro Ser Glu Thr Glu Thr Pro Gly Gly Glu 
755 760 765 

Asp Pro Glu Thr Pro Gly Asp Glu Glu Thr Pro Gly Glu Asp Thr Glu 
770 775 780 

Lys Pro Gly Asp Glu Glu Thr Pro Gly Glu Asp Thr Glu Lys Pro Gly 
785 790 795 800 

Asp Glu Lys Pro Arg Pro Ser Asp Asp Ala Asp Asn Asp Asp Lys Met 
805 810 815 

Pro Gln Thr Gly Ser Ala Val Ile Gly Ile Ala Val Val Ala Leu Leu 
820 825 830 

Leu Val Ala Ala Gly Cys Gly Leu Val Ile Ala Arg Arg Arg 
835 840 845 

<210> 91 
<211> 1023 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 91 

atgaatgtat cggtaccggc cgagtccgca cttaaagaca tcttcgcgga agacttccat 60 

ataggtgcgg cggtcagtag taatacgatc aagtcgcagg agagtctgct tacgcatcac 120 

tttaacagca ttacggcgga aaacgaaatg aagttcgcca gcgtccatcc agaggaagag 180 

ctttacacct tcgaggaagc ggatcagatc gtggacttcg cgcgcaaaca cgggatggct 240 

gtccgcggac atacgctggt atggcataac cagaccaccg attggttgtt ccgcgacaag 300 

cagaatcagc tcgtgagcaa agccgtgctt tatgaaagaa tccgttcgca tatccaaacg 360 

gtagtaggca gatataaggg cgatatttac gcttgggacg ttgtgaacga ggtcattgcc 420 

gatgacggcg atcagttgct gcgtacctcc agctggacgg aaatcgccgg ggacgaattc 480 

atcgccaaag cgtttgaata cgcgcatgct gccgacccga atgcgctgtt gttctacaac 540 

gactacaatg agtcccatcc aagcaaacgg gataaaattt ataccttggt caagtctctt 600 

ctggaccggg gagtacctat tcacggcatt ggcctgcagg cacactggaa tctgttcaac 660 

ccgtccttgg atgacatccg ggcagccatc gaaaaatatg cttcgctagg attgcagctc 720 

cagctcacgg aactggatgt gtcggtattc cgtttcgaag ataagcgggc cgatctgacc 780 

gagcctgaac cgggaatgct ggaacagcag gctgaattct acgaagccgt gttcaagctg 840 

cttaaggaat acagcgatgt aattagcgcg gtgacgttct ggggagctgc ggacgaccac 900 

acctggctca gcgattttcc ggtacgtggg cgcaaaaact ggccgctgct gttcgatgag 960 

cggcacaggc cgaagccggc atattatcgc ttagctgctc ttgccaatca tcttcggcgt 1020 

tga 1023 

<210> 92 
<211> 340 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 92 

Met Asn Val Ser Val Pro Ala Glu Ser Ala Leu Lys Asp Ile Phe Ala 
1 5 10 15 

Glu Asp Phe His Ile Gly Ala Ala Val Ser Ser Asn Thr Ile Lys Ser 
20 25 30 

Gln Glu Ser Leu Leu Thr His His Phe Asn Ser Ile Thr Ala Glu Asn 
35 40 45 

Glu Met Lys Phe Ala Ser Val His Pro Glu Glu Glu Leu Tyr Thr Phe 
50 55 60 

Glu Glu Ala Asp Gln Ile Val Asp Phe Ala Arg Lys His Gly Met Ala 
65 70 75 80 
Val Arg Gly His Thr Leu Val Trp His Asn Gln Thr Thr Asp Trp Leu 
85 90 95 

Phe Arg Asp Lys Gln Asn Gln Leu Val Ser Lys Ala Val Leu Tyr Glu 
100 105 110 

Arg Ile Arg Ser His Ile Gln Thr Val Val Gly Arg Tyr Lys Gly Asp 
115 120 125 

Ile Tyr Ala Trp Asp Val Val Asn Glu Val Ile Ala Asp Asp Gly Asp 
130 135 140 

Gln Leu Leu Arg Thr Ser Ser Trp Thr Glu Ile Ala Gly Asp Glu Phe 
145 150 155 160 

Ile Ala Lys Ala Phe Glu Tyr Ala His Ala Ala Asp Pro Asn Ala Leu 
165 170 175 

Leu Phe Tyr Asn Asp Tyr Asn Glu Ser His Pro Ser Lys Arg Asp Lys 
180 185 190 

Ile Tyr Thr Leu Val Lys Ser Leu Leu Asp Arg Gly Val Pro Ile His 
195 200 205 

Gly Ile Gly Leu Gln Ala His Trp Asn Leu Phe Asn Pro Ser Leu Asp 
210 215 220 

Asp Ile Arg Ala Ala Ile Glu Lys Tyr Ala Ser Leu Gly Leu Gln Leu 
225 230 235 240 

Gln Leu Thr Glu Leu Asp Val Ser Val Phe Arg Phe Glu Asp Lys Arg 
245 250 255 

Ala Asp Leu Thr Glu Pro Glu Pro Gly Met Leu Glu Gln Gln Ala Glu 
260 265 270 

Phe Tyr Glu Ala Val Phe Lys Leu Leu Lys Glu Tyr Ser Asp Val Ile 
275 280 285 

Ser Ala Val Thr Phe Trp Gly Ala Ala Asp Asp His Thr Trp Leu Ser 
290 295 300 

Asp Phe Pro Val Arg Gly Arg Lys Asn Trp Pro Leu Leu Phe Asp Glu 
305 310 315 320 

Arg His Arg Pro Lys Pro Ala Tyr Tyr Arg Leu Ala Ala Leu Ala Asn 
325 330 335 

His Leu Arg Arg 
340 

<210> 93 
<211> 1011 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 93 

atgaatcaat cagtaaatga agcacaggtt cctgcattat cggatgtata tgaagattat 60 

ttttcaatag gtgccgctgt taatccactt actttaggta cgcaaaaaaa gctgttaacc 120 

aaacatttta atagtataac ggctgagaat gaaatgaaat ttgaagcatt acagcctaaa 180 

ccagatcaat ttacatttga tacggcggat aaaatggttg cctttgccca agcacatgat 240 

atgaagatgc gtggccatac attaatctgg cacaatcaaa caccagattg gatgtttttg 300 

caaaaagacg gtacgacaat tgatcgtgaa acactcttgg agagaatgaa aaaacatatt 360 

aagacggtgg tggaaagata taaaggcaaa atatattgtt gggacgttgt aaatgaagcg 420 

gtagctgatg aaggcgaagc tattttaaga ccatcaaaat ggacggacat tattggcgac 480 

tcgtttattg agtatgcttt taaatacgcc cacgaggccg atcccgatgc actgttgttt 540 

tacaatgact acaatgcttg ccaccctcat aaaagagata agatttatca acttgtaaag 600 

gggttaatag acaagggtgt gcccatacac ggtattggcc tacaagcaca ttggaacatt 660 

gttgacccgt cttacgatga tattaaacga gccatcgaaa cttatgcatc attaggatta 720 

agcatacact ttactgaaat ggatgtgtct gtttttgaat atcatgatcg aagaacagac 780 

ttattggaac ctacaaaaga tatggtttca cgtcaagctg agcgttatca ggcatttttt 840 

gaaatattta ggtcgtatgc tgatgtgatt gattccgtta cgttttgggg catggccgat 900 

gattatacat ggcttgatga ttttccggtg acaggtcgaa aaaattggcc ctttgtattt 960 

gatgcgagac atcagcctaa aacagcattc tggaacatcg ttgattttta a 1011 

<210> 94 
<211> 336 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 94 

Met Asn Gln Ser Val Asn Glu Ala Gln Val Pro Ala Leu Ser Asp Val 
1 5 10 15 

Tyr Glu Asp Tyr Phe Ser Ile Gly Ala Ala Val Asn Pro Leu Thr Leu 
20 25 30 

Gly Thr Gln Lys Lys Leu Leu Thr Lys His Phe Asn Ser Ile Thr Ala 
35 40 45 

Glu Asn Glu Met Lys Phe Glu Ala Leu Gln Pro Lys Pro Asp Gln Phe 
50 55 60 

Thr Phe Asp Thr Ala Asp Lys Met Val Ala Phe Ala Gln Ala His Asp 
65 70 75 80 

Met Lys Met Arg Gly His Thr Leu Ile Trp His Asn Gln Thr Pro Asp 
85 90 95 

Trp Met Phe Leu Gln Lys Asp Gly Thr Thr Ile Asp Arg Glu Thr Leu 
100 105 110 

Leu Glu Arg Met Lys Lys His Ile Lys Thr Val Val Glu Arg Tyr Lys 
115 120 125 

Gly Lys Ile Tyr Cys Trp Asp Val Val Asn Glu Ala Val Ala Asp Glu 
130 135 140 

Gly Glu Ala Ile Leu Arg Pro Ser Lys Trp Thr Asp Ile Ile Gly Asp 
145 150 155 160 

Ser Phe Ile Glu Tyr Ala Phe Lys Tyr Ala His Glu Ala Asp Pro Asp 
165 170 175 

Ala Leu Leu Phe Tyr Asn Asp Tyr Asn Ala Cys His Pro His Lys Arg 
180 185 190 

Asp Lys Ile Tyr Gln Leu Val Lys Gly Leu Ile Asp Lys Gly Val Pro 
195 200 205 

Ile His Gly Ile Gly Leu Gln Ala His Trp Asn Ile Val Asp Pro Ser 
210 215 220 

Tyr Asp Asp Ile Lys Arg Ala Ile Glu Thr Tyr Ala Ser Leu Gly Leu 
225 230 235 240 

Ser Ile His Phe Thr Glu Met Asp Val Ser Val Phe Glu Tyr His Asp 
245 250 255 

Arg Arg Thr Asp Leu Leu Glu Pro Thr Lys Asp Met Val Ser Arg Gln 
260 265 270 

Ala Glu Arg Tyr Gln Ala Phe Phe Glu Ile Phe Arg Ser Tyr Ala Asp 
275 280 285 

Val Ile Asp Ser Val Thr Phe Trp Gly Met Ala Asp Asp Tyr Thr Trp 
290 295 300 

Leu Asp Asp Phe Pro Val Thr Gly Arg Lys Asn Trp Pro Phe Val Phe 
305 310 315 320 

Asp Ala Arg His Gln Pro Lys Thr Ala Phe Trp Asn Ile Val Asp Phe 
325 330 335 

<210> 95 
<211> 1143 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 95 

atgaaaaaaa cgattgcaca tttcacctta tggatagtgt tttttctctt cacttcctgt 60 

gctgttacgg cgcagaagaa tgctaagaat acaagagtaa aactcactac cctaaaagag 120 

gcttaccaag gtaaattcta tatcggtact gcgatgaatc tgagacagat tcacggagat 180 

gatccccagt ctgaaaatat tatcaaaaaa cagttcaatt ccatagttgc cgaaaactgc 240 

atgaagagta tgtatcttca gccggaggaa ggaaaatttt tcttcgatga tgcggacaag 300 

tttgtggatt ttggtcttca gaacaatatg ttcatcatcg ggcattgtct gatttggcat 360 

tcgcaggcgc caaaatggtt tttcaccgat gagaatggaa aaacggtttc cccagaagtt 420 

cttaaacaaa ggatgaaagc ccatatcacc gctgtcgttt cccgctacaa agggaaaatc 480 

aaaggttggg atgtggtgaa cgaagccatt atggaagatg gttcttaccg caaaagcaaa 540 

ttttatgaga ttttgggaga agaatttatt ccgttggcat ttcagtatgc gcatgaagca 600 

gatcctgatg cagaacttta ttacaacgat tataacgaat ggtatcccgg aaaaagagct 660 

acggtgacca agataatccg cgatttcaaa tctagaggaa tccgcattga tgccatcgga 720 

atgcaggctc atttcgggat ggattcgccc actttagaag agtatgaaca aaccattcag 780 

ggctatataa aagaaggcgt gaaagtcaat attacggaac tcgatttgag tccgcttcct 840 

tctccttggg gaacttccgc caatgttgcc gatacgcagc agtatcagga aaaaatgaat 900 
ccttacacca aaggacttcc cgccgatgtg gaaaaagcat gggaaaaccg ctatctcgat 960

tttttcaaac tgttcctgaa atatcatcag catatcgagc gtgttacgtt ttggggcgtt 1020 

agcgatatcg attcctggaa gaacgatttt ccagtaagag gacgtaccga ttatccacta 1080 

ccgtttaacc gacagtatca ggcaaaacct ttggtgcaga aattaataga cttaacgaaa 1140 

tag 1143 

<210> 96 
<211> 380 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(24)

<400> 96 

Met Lys Lys Thr Ile Ala His Phe Thr Leu Trp Ile Val Phe Phe Leu
1 5 10 15

Phe Thr Ser Cys Ala Val Thr Ala Gln Lys Asn Ala Lys Asn Thr Arg
20 25 30

Val Lys Leu Thr Thr Leu Lys Glu Ala Tyr Gln Gly Lys Phe Tyr Ile
35 40 45

Gly Thr Ala Met Asn Leu Arg Gln Ile His Gly Asp Asp Pro Gln Ser
50 55 60

Glu Asn Ile Ile Lys Lys Gln Phe Asn Ser Ile Val Ala Glu Asn Cys
65 70 75 80

Met Lys Ser Met Tyr Leu Gln Pro Glu Glu Gly Lys Phe Phe Phe Asp
85 90 95

Asp Ala Asp Lys Phe Val Asp Phe Gly Leu Gln Asn Asn Met Phe Ile
100 105 110

Ile Gly His Cys Leu Ile Trp His Ser Gln Ala Pro Lys Trp Phe Phe
115 120 125

Thr Asp Glu Asn Gly Lys Thr Val Ser Pro Glu Val Leu Lys Gln Arg
130 135 140

Met Lys Ala His Ile Thr Ala Val Val Ser Arg Tyr Lys Gly Lys Ile
145 150 155 160

Lys Gly Trp Asp Val Val Asn Glu Ala Ile Met Glu Asp Gly Ser Tyr
165 170 175

Arg Lys Ser Lys Phe Tyr Glu Ile Leu Gly Glu Glu Phe Ile Pro Leu
180 185 190

Ala Phe Gln Tyr Ala His Glu Ala Asp Pro Asp Ala Glu Leu Tyr Tyr
195 200 205

Asn Asp Tyr Asn Glu Trp Tyr Pro Gly Lys Arg Ala Thr Val Thr Lys
210 215 220

Ile Ile Arg Asp Phe Lys Ser Arg Gly Ile Arg Ile Asp Ala Ile Gly
225 230 235 240

Met Gln Ala His Phe Gly Met Asp Ser Pro Thr Leu Glu Glu Tyr Glu
245 250 255

Gln Thr Ile Gln Gly Tyr Ile Lys Glu Gly Val Lys Val Asn Ile Thr
260 265 270

Glu Leu Asp Leu Ser Pro Leu Pro Ser Pro Trp Gly Thr Ser Ala Asn
275 280 285

Val Ala Asp Thr Gln Gln Tyr Gln Glu Lys Met Asn Pro Tyr Thr Lys
290 295 300

Gly Leu Pro Ala Asp Val Glu Lys Ala Trp Glu Asn Arg Tyr Leu Asp
305 310 315 320

Phe Phe Lys Leu Phe Leu Lys Tyr His Gln His Ile Glu Arg Val Thr
325 330 335

Phe Trp Gly Val Ser Asp Ile Asp Ser Trp Lys Asn Asp Phe Pro Val
340 345 350

Arg Gly Arg Thr Asp Tyr Pro Leu Pro Phe Asn Arg Gln Tyr Gln Ala
355 360 365

Lys Pro Leu Val Gln Lys Leu Ile Asp Leu Thr Lys
370 375 380

<210> 97 
<211> 1407 
<212> DNA
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 



<400> 97 

atgaatgaaa cctcgcggaa ttggttggag agaggattgc ctttcgaacg ccaacggcgt 60

tccaacattc agcccagggt tggcgcttgc gcctaccctg ggttggaagc aatcgctcca 120

tcaaccctga aagggttgca gcggaggttt gcacaagacc gatacaaccc tttcaggatt 180

ggctttctcc cttttccacc cagggtagcg cctgcggcgc aaccctgggc tgatggatca 240

gaacgccgtt ggcgttcccg gaaacctgcg aagaaacaac tcgccttcct ggccatcacc 300

agtctcctct cgggtctgct gtggggcgcc gaagtgcaac cggcactgaa agacgtattc 360

cgccaggact tcctgctggg ggcggcgttg aacgcggagc aggtgctgga caccaaccgg 420

gtcgagtcgg tattgatcga aaagcatttc aacacgatca cgcccgagaa tgtgctgaag 480

tgggaacgag tccatcctca gcccaaccag tattcttttg aggacgcgga tcgctacgtc 540

gagttcggcc gcaaacacgg aatggtcatc atcggccaca cgctggtctg gcacagccag 600

acgcccggct gggtcttccg ggatgccgac ggaaagacgc tgacgcgcga agccctgctg 660

gagcggatgc gcgaccacat ccacaccgtg gtcgggcgct acaagggcaa gatccgcggc 720

tgggatgtgg tgaacgaggc gctgcgcgac gacggcgcgt ggcggaattc ccaatggcgg 780

cggatcatcg gcgacgatta cattttgaaa gccttccagt atgcccatga ggccgatccg 840

gatgcggagc tctattacaa cgattattcg ctggagaagc cggccaagcg caatggcgcc 900

gtggacctgg tgaagcagct ccaggccggc ggggcgaagc tggccggcgt cggcttgcag 960

gggcactaca acctcgactg gccggagacc gccgagatcg aaaacaccat cgcggcgttc 1020

gcggagctgg ggctcaaggt gatgatcacg gagctggacg tcaacgcgct gccgacgccc 1080

ggccagtcgg gcgaagccga tgtagggatg acgttcggcg gcaatttcgg cggcgataaa 1140

tggaatcctt tcacgaacgg actgccggcc gcagtggagc aacgcctcgc ggaccgctac 1200

gctgaaatct tcaggatctt cacgaagcac agccgtcgga tttcgcgcgt caccttctgg 1260

ggcgtcaccg accggacctc ctggctcaac aattttccca tccgcggccg gaccaattac 1320

ccgttgctct ttgatcgggc tggggagccc aaacccgcgt tccgatccgt cgtggcggtc 1380

cgtcagccgc gccagcccgt cgaatga 1407

<210> 98
<211> 468
<212> PRT
<213> Unknown

<220>

<223> Obtained from an environmental sample
<400> 98

Met Asn Glu Thr Ser Arg Asn Trp Leu Glu Arg Gly Leu Pro Phe Glu
1 5 10 15

Arg Gln Arg Arg Ser Asn Ile Gln Pro Arg Val Gly Ala Cys Ala Tyr
20 25 30

Pro Gly Leu Glu Ala Ile Ala Pro Ser Thr Leu Lys Gly Leu Gln Arg
35 40 45

Arg Phe Ala Gln Asp Arg Tyr Asn Pro Phe Arg Ile Gly Phe Leu Pro
50 55 60

Phe Pro Pro Arg Val Ala Pro Ala Ala Gln Pro Trp Ala Asp Gly Ser
65 70 75 80

Glu Arg Arg Trp Arg Ser Arg Lys Pro Ala Lys Lys Gln Leu Ala Phe
85 90 95

Leu Ala Ile Thr Ser Leu Leu Ser Gly Leu Leu Trp Gly Ala Glu Val
100 105 110

Gln Pro Ala Leu Lys Asp Val Phe Arg Gln Asp Phe Leu Leu Gly Ala
115 120 125

Ala Leu Asn Ala Glu Gln Val Leu Asp Thr Asn Arg Val Glu Ser Val
130 135 140

Leu Ile Glu Lys His Phe Asn Thr Ile Thr Pro Glu Asn Val Leu Lys
145 150 155 160

Trp Glu Arg Val His Pro Gln Pro Asn Gln Tyr Ser Phe Glu Asp Ala
165 170 175

Asp Arg Tyr Val Glu Phe Gly Arg Lys His Gly Met Val Ile Ile Gly
180 185 190

His Thr Leu Val Trp His Ser Gln Thr Pro Gly Trp Val Phe Arg Asp
195 200 205

Ala Asp Gly Lys Thr Leu Thr Arg Glu Ala Leu Leu Glu Arg Met Arg
210 215 220

Asp His Ile His Thr Val Val Gly Arg Tyr Lys Gly Lys Ile Arg Gly
225 230 235 240

Trp Asp Val Val Asn Glu Ala Leu Arg Asp Asp Gly Ala Trp Arg Asn
245 250 255

Ser Gln Trp Arg Arg Ile Ile Gly Asp Asp Tyr Ile Leu Lys Ala Phe
260 265 270

Gln Tyr Ala His Glu Ala Asp Pro Asp Ala Glu Leu Tyr Tyr Asn Asp
275 280 285

Tyr Ser Leu Glu Lys Pro Ala Lys Arg Asn Gly Ala Val Asp Leu Val
290 295 300

Lys Gln Leu Gln Ala Gly Gly Ala Lys Leu Ala Gly Val Gly Leu Gln
305 310 315 320

Gly His Tyr Asn Leu Asp Trp Pro Glu Thr Ala Glu Ile Glu Asn Thr
325 330 335

Ile Ala Ala Phe Ala Glu Leu Gly Leu Lys Val Met Ile Thr Glu Leu
340 345 350

Asp Val Asn Ala Leu Pro Thr Pro Gly Gln Ser Gly Glu Ala Asp Val
355 360 365

Gly Met Thr Phe Gly Gly Asn Phe Gly Gly Asp Lys Trp Asn Pro Phe
370 375 380

Thr Asn Gly Leu Pro Ala Ala Val Glu Gln Arg Leu Ala Asp Arg Tyr
385 390 395 400

Ala Glu Ile Phe Arg Ile Phe Thr Lys His Ser Arg Arg Ile Ser Arg
405 410 415

Val Thr Phe Trp Gly Val Thr Asp Arg Thr Ser Trp Leu Asn Asn Phe
420 425 430

Pro Ile Arg Gly Arg Thr Asn Tyr Pro Leu Leu Phe Asp Arg Ala Gly
435 440 445

Glu Pro Lys Pro Ala Phe Arg Ser Val Val Ala Val Arg Gln Pro Arg
450 455 460

Gln Pro Val Glu
465

<210> 99 
<211> 1074 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 99 

gtgcgctcaa gagctagcgc gtactggttc ggcgtggggt tggtggtggc gctgagcctg 60 

gctcagaccc cttcccccca gtccctgcgc gcgctggccg agcgccaggg gctgctggtg 120 

ggagccgcgg tggacctagc ggccctgtac gaccccctcg agcccgagta cgcccaactc 180 

ctcgcccgcg agttcaacct ggtggtggcc gagaacgcca tgaagtgggc ctccctgagc 240 

aacgcgcggg ggcagtacag cttcaccggc gctgacgccc tggtgcgctt cgcccgccag 300 

cacggccagc gcttgcgcgg ccacaccctc atctggcacg agcaactgcc cgcgtgggtg 360 

cgcagcggca ccttctcccg cgaggccatg ctggcggtga tgcaggagca cattcaggcg 420 

gtggccgggc acttccgcgg ccaggtggcc tactgggacg tggtcaacga ggcggtgagt 480 

gaccggggcg gcctgcgcga gacccccttt ctgcgggcgg tgggccccga ctacctcgag 540 

cacgccttcc gcttcgcccg cgccgccgac ccccaggcca agctcttcta caacgactac 600 

ggcgccgacg gcatgggcgc taaatcggac gagatctacg ccctgctcaa agcgctcaag 660 

gccaaggggg tacccgtcga cggggtgggc ttccaggccc acctcgacag caccttctcg 720 

gtccagcagg cgcggatgcg ggagaaccta gagacgcttc gccgacctgg gcctcgaggt 780 

gcacatcacc gagctggacg tgcagctaaa aggggcgggc tcgcgggagg aacggctgga 840 

ggcgcaggcc cggatctacg ccgaggtgct ggcgacctgc cgcgcggtcc gcggctgcag 900 

cgccgtgacg ctgtggggct tcaccgacgc ccactcctgg cgagccgccg ccgaacccct 960 

gatcttcgac gcgctctacc ggcccaaacc ggcgtaccag gctctgctgc gggctctggg 1020 

aggcaaccct tgagcctttt cagcccagtt ttgccaacga ggacagcact atga 1074 

<210> 100 
<211> 357 
<212> PRT 
<213> Unknown

<220> 

<223> Obtained from an environmental sample 
<221> SIGNAL
<222> (1)...(33)

<400> 100
Val Arg Ser Arg Ala Ser Ala Tyr Trp Phe Gly Val Gly Leu Val Val
1 5 10 15

Ala Leu Ser Leu Ala Gln Thr Pro Ser Pro Gln Ser Leu Arg Ala Leu
20 25 30

Ala Glu Arg Gln Gly Leu Leu Val Gly Ala Ala Val Asp Leu Ala Ala
35 40 45

Leu Tyr Asp Pro Leu Glu Pro Glu Tyr Ala Gln Leu Leu Ala Arg Glu
50 55 60

Phe Asn Leu Val Val Ala Glu Asn Ala Met Lys Trp Ala Ser Leu Ser
65 70 75 80

Asn Ala Arg Gly Gln Tyr Ser Phe Thr Gly Ala Asp Ala Leu Val Arg
85 90 95

Phe Ala Arg Gln His Gly Gln Arg Leu Arg Gly His Thr Leu Ile Trp
100 105 110

His Glu Gln Leu Pro Ala Trp Val Arg Ser Gly Thr Phe Ser Arg Glu
115 120 125

Ala Met Leu Ala Val Met Gln Glu His Ile Gln Ala Val Ala Gly His
130 135 140

Phe Arg Gly Gln Val Ala Tyr Trp Asp Val Val Asn Glu Ala Val Ser
145 150 155 160

Asp Arg Gly Gly Leu Arg Glu Thr Pro Phe Leu Arg Ala Val Gly Pro
165 170 175

Asp Tyr Leu Glu His Ala Phe Arg Phe Ala Arg Ala Ala Asp Pro Gln
180 185 190

Ala Lys Leu Phe Tyr Asn Asp Tyr Gly Ala Asp Gly Met Gly Ala Lys
195 200 205

Ser Asp Glu Ile Tyr Ala Leu Leu Lys Ala Leu Lys Ala Lys Gly Val
210 215 220

Pro Val Asp Gly Val Gly Phe Gln Ala His Leu Asp Ser Thr Phe Ser
225 230 235 240

Val Gln Gln Ala Arg Met Arg Glu Asn Leu Glu Thr Leu Arg Arg Pro
245 250 255

Gly Pro Arg Gly Ala His His Arg Ala Gly Arg Ala Ala Lys Arg Gly
260 265 270

Gly Leu Ala Gly Gly Thr Ala Gly Gly Ala Gly Pro Asp Leu Arg Arg
275 280 285

Gly Ala Gly Asp Leu Pro Arg Gly Pro Arg Leu Gln Arg Arg Asp Ala
290 295 300

Val Gly Leu His Arg Arg Pro Leu Leu Ala Ser Arg Arg Arg Thr Pro
305 310 315 320

Asp Leu Arg Arg Ala Leu Pro Ala Gln Thr Gly Val Pro Gly Ser Ala
325 330 335

Ala Gly Ser Gly Arg Gln Pro Leu Ser Leu Phe Ser Pro Val Leu Pro
340 345 350

Thr Arg Thr Ala Leu
355

<210> 101 
<211> 1131 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 101 

atgaagtatt ggcttacaac cctggtttta atgatagcgg gaataccctt ggcttttggt 60 

tcttcagcaa agcaagataa atcaaagagt ttgaaagatg ctttcaaaaa caaattctat 120 

atcggtgtgg ctttgaaccg gagtcaatat ctggaacaaa acgaacaggc ggataaagag 180 

ataaaggcac agttcagctc tattgtagct gagaactgca tgaaaagcga aaatctggaa 240 

cctaaagagg gaaaattctt ctttgacgat gccgatcgtt ttgtcgcttt tggagaaaaa 300 

aatggaatgt acatcattgg acatacctta atttggcatt ctcaagtgcc aaaatggttt 360 

ttcatagata atgaaggcaa agttgtttcc cgggaagttt tgattgaacg aatgaaaaac 420 

tacatccata cagttgtcgg tcattataaa ggtcgagtta aaggttggga tgttgtcaat 480 

gaggccattc tagatgatgg ctcatttaga caaagtaatt tctttaaaat actaggagcc 540 


gattttatta aacttgcttt tcaatttgcc catgaagcag atcccaatgc tgagctttat 600 

tacaacgatt attcgatgtc caatccgatc aaaagagacg gagtggttcg catggtgaag 660 

tcattgcagc aacaaggtgt gagaatagac gctatcggaa tgcagggaca cgtagggatg 720 

gattatccca agttggatga gtttgaaaat agtatcaaag ctttttcgtc tttaggaacc 780 

aaagtgatga ttacggaact cgatttaagt gtcctaccaa ctcctaaagg aaaacaaggt 840 

gctaatattt cggatgttgc cgcttatgag gaaaagataa atccttacaa aaatggtctg 900 

ccggctgaag ttgaaaaggc ttgggaagac cggtatttgg attttttcaa attatttttg 960 

aaatatcaac accaaatttc aagggttaca ttatgggggc ttagtgatca ggattcgtgg 1020 

aaaaatgatt tcccagtcag agggagaacg gattatcctt tgcttttcga cagacaatac 1080 

aaaccaaaac ctgtagttca gaaaattatt aaattagcat tgaaaaaata a 1131

<210> 102 
<211> 376 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(23)

<400> 102 

Met Lys Tyr Trp Leu Thr Thr Leu Val Leu Met Ile Ala Gly Ile Pro
1 5 10 15

Leu Ala Phe Gly Ser Ser Ala Lys Gln Asp Lys Ser Lys Ser Leu Lys
20 25 30

Asp Ala Phe Lys Asn Lys Phe Tyr Ile Gly Val Ala Leu Asn Arg Ser
35 40 45

Gln Tyr Leu Glu Gln Asn Glu Gln Ala Asp Lys Glu Ile Lys Ala Gln
50 55 60

Phe Ser Ser Ile Val Ala Glu Asn Cys Met Lys Ser Glu Asn Leu Glu
65 70 75 80

Pro Lys Glu Gly Lys Phe Phe Phe Asp Asp Ala Asp Arg Phe Val Ala
85 90 95

Phe Gly Glu Lys Asn Gly Met Tyr Ile Ile Gly His Thr Leu Ile Trp
100 105 110

His Ser Gln Val Pro Lys Trp Phe Phe Ile Asp Asn Glu Gly Lys Val
115 120 125

Val Ser Arg Glu Val Leu Ile Glu Arg Met Lys Asn Tyr Ile His Thr
130 135 140

Val Val Gly His Tyr Lys Gly Arg Val Lys Gly Trp Asp Val Val Asn
145 150 155 160

Glu Ala Ile Leu Asp Asp Gly Ser Phe Arg Gln Ser Asn Phe Phe Lys
165 170 175

Ile Leu Gly Ala Asp Phe Ile Lys Leu Ala Phe Gln Phe Ala His Glu
180 185 190

Ala Asp Pro Asn Ala Glu Leu Tyr Tyr Asn Asp Tyr Ser Met Ser Asn
195 200 205

Pro Thr Lys Arg Asp Gly Val Val Arg Met Val Lys Ser Leu Gln Gln
210 215 220

Gln Gly Val Arg Ile Asp Ala Ile Gly Met Gln Gly His Val Gly Met
225 230 235 240

Asp Tyr Pro Lys Leu Asp Glu Phe Glu Asn Ser Ile Lys Ala Phe Ser
245 250 255

Ser Leu Gly Thr Lys Val Met Ile Thr Glu Leu Asp Leu Ser Val Leu
260 265 270

Pro Thr Pro Lys Gly Lys Gln Gly Ala Asn Ile Ser Asp Val Ala Ala
275 280 285

Tyr Glu Glu Lys Ile Asn Pro Tyr Lys Asn Gly Leu Pro Ala Glu Val
290 295 300

Glu Lys Ala Trp Glu Asp Arg Tyr Leu Asp Phe Phe Lys Leu Phe Leu
305 310 315 320

Lys Tyr Gln His Gln Ile Ser Arg Val Thr Leu Trp Gly Leu Ser Asp
325 330 335

Gln Asp Ser Trp Lys Asn Asp Phe Pro Val Arg Gly Arg Thr Asp Tyr
340 345 350

Pro Leu Leu Phe Asp Arg Gln Tyr Lys Pro Lys Pro Val Val Gln Lys
355 360 365

Ile Ile Lys Leu Ala Leu Lys Lys
370 375

<210> 103 
<211> 1449
<212> DNA 
<213> Bacteria 



<220> 

<223> Obtained from an environmental sample 



<400> 103 

atgcgttcac attcccttcc cccgtccacc gtccgccgga aattgggcgg cctcggcgcg 60

gcgctgctcg tcggcgccgt cggcgccgcc accgtgctcg tggcgcccct cacctcgcac 120

gccgccgaga gcacgctcgg cgccgcggcg aagcagagcg gccggtactt cggcaccgcc 180

atcgcctcgg gcaggctcaa cgactcgacg tacacgacga tcgcgaaccg cgagttcaac 240

tcggtgaccg ccgagaacga gatgaagatc gacgccaccg aaccccagca gggccgcttc 300

gacttcaccg ccggcgaccg cgtctacaac tgggcggtgc agaacggcaa gcaggtacgg 360

ggccacaccc tggcctggca ctcccagcag cccgcctgga tgcagaacct cagcggcagc 420

gcgctgcgca cggcgatgac caaccacatc aacggcgtca tggcccacta caagggcaag 480

atcggccagt gggacgtcgt caacgaggcg ttcgcggacg gcagttcggg agcgcgccgg 540

gactccaacc tccagcggag cggcaacgac tggatcgagg tcgccttccg caccgcccgc 600

gccgccgacc cggccgccaa gctctgctac aacgactaca acgtcgagaa ctggacgtgg 660

gccaagaccc aggccatgta cgccatggtc aaggacttca agcagcgcgg cgtgcccatc 720

gactgcgtcg gcttccagtc gcacttcaac aacgacagcc cctacaacag caacttccgc 780

accaccctcc agagtttcgc cgccctcggc gtcgacgtgg ccatcaccga actcgacatc 840

cagggcgcct cgggcacgac ctacgccaac gtgaccaacg actgcctggc cgtcccgcgc 900

tgcctcggca tcaccgtctg gggtgtccgc gacaccgact cctggcgagc cgagcacact 960

ccgctgctct tcaacggcga cggcagcaag aagcccgcct actcctccgt cctcaacgcc 1020

ctcaactccg tctcccccaa ccccaacccc actccgaccc cctcccccgg cgccgggccg 1080

atcaagggag tcgcctcggg ccgctgcgtg gacgtacccg gagccggcac cgccgacggc 1140

acccaggtcc agctgtggga ctgcaacaac cgcaccaacc agcagtggac cctcaccgcc 1200

gccggtgagc tcagggtcta cggcgacaag tgcctggacg ccgccggcac cggcaacggc 1260

gccaaggtcc agatctacag ctgctggggc ggcgacaacc agaagtggcg cctcaactcc 1320

gacggttcca tcgtcggtgt ccagtccggc ctctgcctcg acgccgctgc cggcggcacc 1380

gccaacggca cgctgatcca gctctactcc tgctggaaca gcggcaacca gcgctggacc 1440

cgcacctga 1449

<210> 104
<211> 482
<212> PRT
<213> Bacteria

<220>

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(41)

<400> 104

Met Arg Ser His Ser Leu Pro Pro Ser Thr Val Arg Arg Lys Leu Gly
1 5 10 15

Gly Leu Gly Ala Ala Leu Leu Val Gly Ala Val Gly Ala Ala Thr Val
20 25 30

Leu Val Ala Pro Leu Thr Ser His Ala Ala Glu Ser Thr Leu Gly Ala
35 40 45

Ala Ala Lys Gln Ser Gly Arg Tyr Phe Gly Thr Ala Ile Ala Ser Gly
50 55 60

Arg Leu Asn Asp Ser Thr Tyr Thr Thr Ile Ala Asn Arg Glu Phe Asn
65 70 75 80

Ser Val Thr Ala Glu Asn Glu Met Lys Ile Asp Ala Thr Glu Pro Gln
85 90 95

Gln Gly Arg Phe Asp Phe Thr Ala Gly Asp Arg Val Tyr Asn Trp Ala
100 105 110

Val Gln Asn Gly Lys Gln Val Arg Gly His Thr Leu Ala Trp His Ser
115 120 125

Gln Gln Pro Ala Trp Met Gln Asn Leu Ser Gly Ser Ala Leu Arg Thr
130 135 140

Ala Met Thr Asn His Ile Asn Gly Val Met Ala His Tyr Lys Gly Lys
145 150 155 160

Ile Gly Gln Trp Asp Val Val Asn Glu Ala Phe Ala Asp Gly Ser Ser
165 170 175

Gly Ala Arg Arg Asp Ser Asn Leu Gln Arg Ser Gly Asn Asp Trp Ile
180 185 190

Glu Val Ala Phe Arg Thr Ala Arg Ala Ala Asp Pro Ala Ala Lys Leu
195 200 205

Cys Tyr Asn Asp Tyr Asn Val Glu Asn Trp Thr Trp Ala Lys Thr Gln
210 215 220

Ala Met Tyr Ala Met Val Lys Asp Phe Lys Gln Arg Gly Val Pro Ile
225 230 235 240

Asp Cys Val Gly Phe Gln Ser His Phe Asn Asn Asp Ser Pro Tyr Asn
245 250 255

Ser Asn Phe Arg Thr Thr Leu Gln Ser Phe Ala Ala Leu Gly Val Asp
260 265 270

Val Ala Ile Thr Glu Leu Asp Ile Gln Gly Ala Ser Gly Thr Thr Tyr
275 280 285

Ala Asn Val Thr Asn Asp Cys Leu Ala Val Pro Arg Cys Leu Gly Ile
290 295 300

Thr Val Trp Gly Val Arg Asp Thr Asp Ser Trp Arg Ala Glu His Thr
305 310 315 320

Pro Leu Leu Phe Asn Gly Asp Gly Ser Lys Lys Pro Ala Tyr Ser Ser
325 330 335

Val Leu Asn Ala Leu Asn Ser Val Ser Pro Asn Pro Asn Pro Thr Pro
340 345 350

Thr Pro Ser Pro Gly Ala Gly Pro Ile Lys Gly Val Ala Ser Gly Arg
355 360 365

Cys Val Asp Val Pro Gly Ala Gly Thr Ala Asp Gly Thr Gln Val Gln
370 375 380

Leu Trp Asp Cys Asn Asn Arg Thr Asn Gln Gln Trp Thr Leu Thr Ala
385 390 395 400

Ala Gly Glu Leu Arg Val Tyr Gly Asp Lys Cys Leu Asp Ala Ala Gly
405 410 415

Thr Gly Asn Gly Ala Lys Val Gln Ile Tyr Ser Cys Trp Gly Gly Asp
420 425 430

Asn Gln Lys Trp Arg Leu Asn Ser Asp Gly Ser Ile Val Gly Val Gln
435 440 445

Ser Gly Leu Cys Leu Asp Ala Ala Ala Gly Gly Thr Ala Asn Gly Thr
450 455 460

Leu Ile Gln Leu Tyr Ser Cys Trp Asn Ser Gly Asn Gln Arg Trp Thr
465 470 475 480

Arg Thr

<210> 105 
<211> 2793 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 105 

atgaagttca ctttgatgcc gctgctgtgc gggttcgcct tgctgttggg ttgcgcggtg 60 

caggcaaccc cagccgcttc gttacagcag gcttatcagc cgtatttcca tatcggtact 120 

gccgtcagct tggcgcaact gcaagcatcg aaaaaccatg aacgagattt aatcgcccag 180 

cactttaaca gtctgaccgc tgaaaacctg atgaaatggg aaaaaatcca accgactgaa 240 

ggcaactttg attttacagc ggccgacaag ctcgtcgctt ttgctgaaca acatcggatg 300 

tggctggtcg gccatacgat cctgtggcat gaacaaaccc cggactgggt atttcagggg 360 

ccagatggca aaccggccag caagcaagtg ttactcggca gattaaaaaa gcatatccaa 420 

actgtggtcg gtcgttacca aggtcgggta catggctggg atgtagtgaa tgaagcgctg 480 

aatgaagatg gcagtctgcg cgatacgccg tggcgaaaaa ttctgggtga tgattacatt 540 

gccaccactt ttgcgctggt gcatcaggtc gaccccaaag ccaaactcta ttacaacgac 600 

tacaacctgt ataaaccaaa aaaacgcact ggcgtgctac ggatcatcca gcaactgcag 660 

caacaacaag tgcccattca tgccattggc gaacaagcgc attatggtct cgattcgccg 720 

aaattgcagg aagttgaaga ctcgatcaac gcctttgcag ccaccggcct cgacgtgatg 780 

ctgaccgagt tggaaatttc ggtgctaccg tttccgcctg gcatgacacc aggcgccgat 840 

atcagtcagc atcaggaact gcaacaacag ctgaatcctt accgcgaagg cttaccaaaa 900 

accgtcgaac aggcctggca acaacgttat ctggatctgt tttcgctgtt attgcgccag 960 
catcaaaaat tacaccgggt gacgttttgg ggtttagatg atggccaaag ctggcgcaat 1020

aactttccaa tgcgcggtcg taccgattac ccgctactgt ttgaccgcaa gctgcaagcc 1080

aaaccgctat tgagcgcact gatcaaactg gcagaaactc aagcctcagc caagccgaaa 1140

gtaaatcagc tcggttttgc gccaaatgcg caaaaattgc tggtggtgcc ggggcggcag 1200

gcggtgtcgt ttcagatcat caatcaaagc aacggcaaaa cggtgttgca aggccaaagt 1260

tcggtggctc agttttggcc cgaatcgggc gagtgggtca gtatcgctga cttttcgacc 1320

ttaaccaccc aagggcgtta tcaggttgaa gcggctggat taactccgat caccgtcgag 1380

attactgctg aaccttatgc cgcgctgcat gatgcgtcca tcaaagccta ttattttaat 1440

cgcgcctcgc tggcgctgga gccaagtttt gccgggcctt gggcgcgcgc tgccggtcat 1500

ccggataaca aagtgttggt gcacacttcc gccgcttccg acaagcgacc agccggtttt 1560

gtgatcagcg ccgctaaagg ctggtatgac gccggtgact ataacaaata cgtggtcaat 1620

tccggtattt ccagttacac cctattgcaa gcctggcagg attttcctga gttttatcgc 1680

gacagaacat ggaatcttcc ggagtccagc aacaacctac cagacattct cgacgagacg 1740

ttatggaatt tacagtggct gagcaccatg caggacccaa gcgacggcgg cgtgtatcac 1800

aagctgactg aactgaattt ctctgctacc caaatgccgt cagaagtgac agcgccacgt 1860

tatgtggtgc aaaaaaccac ggcagcggca cttaatttcg cggcggtgct ggccaaagcc 1920

agtcgcattt ttacagaatt tgaaacgcaa ctgcccggcc tgtcacagca atatcgccag 1980

caagcattag cagcctggca atgggcgcaa aaaaatccac aacaaattta tcagcaacca 2040

gccgatgttc acaccggtgc ttatggcgac aaacagctgg ctgatgaatg ggcttgggct 2100

ggggcggagc tatatttatt aaccggtgag cagagttacc tgcagccgtt gttggcactt 2160

gagacgccaa tcaccgcagc ttcctgggcc aatgtggcgg cgttgggtta ttttgcgttg 2220

gcatccgctg aacagtttga gcctgcactt cgaaaaaaag tgcagcaaaa aatccaacaa 2280

gccgcggcgc aaattgtagc cgagcatcaa gcgtccgcct accaggtggc gatgactcaa 2340

aaagattttg tctggggcag taatgcggtg gcgatgaaca aaggcatgtt gttatatcaa 2400

gcgtggaaaa ttgacccaca accagagctg cgacaggcga tgcaagggct gctggattac 2460

gtcctcggtc gcaacccgct gcagctgtct tatgtcacag gttttggtgc gcaaagcccg 2520

caacatatcc atcaccgccc ctcggcggca gatcagatca aagcaccagt gccgggctgg 2580

ttagtgggtg gtgcacagcc gggtaagcaa gataaatgct cttattccgg tatttttgct 2640

accggcactt tacccgctgc cagcacttta cctgcaacga cttatctcga ccactggtgc 2700

agctacgcca ccaatgaagt ggcgattaac tggaatgcac ctttagtgta cgtgctggcc 2760

tggagccttt caccagactc catgaccaaa tga 2793

<210> 106
<211> 930
<212> PRT
<213> Unknown

<220>

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(22)

<400> 106

Met Lys Phe Thr Leu Met Pro Leu Leu Cys Gly Phe Ala Leu Leu Leu
1 5 10 15

Gly Cys Ala Val Gln Ala Thr Pro Ala Ala Ser Leu Gln Gln Ala Tyr
20 25 30

Gln Pro Tyr Phe His Ile Gly Thr Ala Val Ser Leu Ala Gln Leu Gln
35 40 45

Ala Ser Lys Asn His Glu Arg Asp Leu Ile Ala Gln His Phe Asn Ser
50 55 60

Leu Thr Ala Glu Asn Leu Met Lys Trp Glu Lys Ile Gln Pro Thr Glu
65 70 75 80

Gly Asn Phe Asp Phe Thr Ala Ala Asp Lys Leu Val Ala Phe Ala Glu
85 90 95

Gln His Arg Met Trp Leu Val Gly His Thr Ile Leu Trp His Glu Gln
100 105 110

Thr Pro Asp Trp Val Phe Gln Gly Pro Asp Gly Lys Pro Ala Ser Lys
115 120 125

Gln Val Leu Leu Gly Arg Leu Lys Lys His Ile Gln Thr Val Val Gly
130 135 140

Arg Tyr Gln Gly Arg Val His Gly Trp Asp Val Val Asn Glu Ala Leu
145 150 155 160

Asn Glu Asp Gly Ser Leu Arg Asp Thr Pro Trp Arg Lys Ile Leu Gly
165 170 175

Asp Asp Tyr Ile Ala Thr Thr Phe Ala Leu Val His Gln Val Asp Pro
180 185 190

Lys Ala Lys Leu Tyr Tyr Asn Asp Tyr Asn Leu Tyr Lys Pro Lys Lys
195 200 205

Arg Thr Gly Val Leu Arg Ile Ile Gln Gln Leu Gln Gln Gln Gln Val
210 215 220

Pro Ile His Ala Ile Gly Glu Gln Ala His Tyr Gly Leu Asp Ser Pro
225 230 235 240

Lys Leu Gln Glu Val Glu Asp Ser Ile Asn Ala Phe Ala Ala Thr Gly
245 250 255

Leu Asp Val Met Leu Thr Glu Leu Glu Ile Ser Val Leu Pro Phe Pro
260 265 270

Pro Gly Met Thr Pro Gly Ala Asp Ile Ser Gln His Gln Glu Leu Gln
275 280 285

Gln Gln Leu Asn Pro Tyr Arg Glu Gly Leu Pro Lys Thr Val Glu Gln
290 295 300

Ala Trp Gln Gln Arg Tyr Leu Asp Leu Phe Ser Leu Leu Leu Arg Gln
305 310 315 320

His Gln Lys Leu His Arg Val Thr Phe Trp Gly Leu Asp Asp Gly Gln
325 330 335

Ser Trp Arg Asn Asn Phe Pro Met Arg Gly Arg Thr Asp Tyr Pro Leu
340 345 350

Leu Phe Asp Arg Lys Leu Gln Ala Lys Pro Leu Leu Ser Ala Leu Ile
355 360 365

Lys Leu Ala Glu Thr Gln Ala Ser Ala Lys Pro Lys Val Asn Gln Leu
370 375 380

Gly Phe Ala Pro Asn Ala Gln Lys Leu Leu Val Val Pro Gly Arg Gln
385 390 395 400

Ala Val Ser Phe Gln Ile Ile Asn Gln Ser Asn Gly Lys Thr Val Leu
405 410 415

Gln Gly Gln Ser Ser Val Ala Gln Phe Trp Pro Glu Ser Gly Glu Trp
420 425 430

Val Ser Ile Ala Asp Phe Ser Thr Leu Thr Thr Gln Gly Arg Tyr Gln
435 440 445

Val Glu Ala Ala Gly Leu Thr Pro Ile Thr Val Glu Ile Thr Ala Glu
450 455 460

Pro Tyr Ala Ala Leu His Asp Ala Ser Ile Lys Ala Tyr Tyr Phe Asn
465 470 475 480

Arg Ala Ser Leu Ala Leu Glu Pro Ser Phe Ala Gly Pro Trp Ala Arg
485 490 495

Ala Ala Gly His Pro Asp Asn Lys Val Leu Val His Thr Ser Ala Ala
500 505 510

Ser Asp Lys Arg Pro Ala Gly Phe Val Ile Ser Ala Ala Lys Gly Trp
515 520 525

Tyr Asp Ala Gly Asp Tyr Asn Lys Tyr Val Val Asn Ser Gly Ile Ser
530 535 540

Ser Tyr Thr Leu Leu Gln Ala Trp Gln Asp Phe Pro Glu Phe Tyr Arg
545 550 555 560

Asp Arg Thr Trp Asn Leu Pro Glu Ser Ser Asn Asn Leu Pro Asp Ile
565 570 575

Leu Asp Glu Thr Leu Trp Asn Leu Gln Trp Leu Ser Thr Met Gln Asp
580 585 590

Pro Ser Asp Gly Gly Val Tyr His Lys Leu Thr Glu Leu Asn Phe Ser
595 600 605

Ala Thr Gln Met Pro Ser Glu Val Thr Ala Pro Arg Tyr Val Val Gln
610 615 620

Lys Thr Thr Ala Ala Ala Leu Asn Phe Ala Ala Val Leu Ala Lys Ala
625 630 635 640

Ser Arg Ile Phe Thr Glu Phe Glu Thr Gln Leu Pro Gly Leu Ser Gln
645 650 655

Gln Tyr Arg Gln Gln Ala Leu Ala Ala Trp Gln Trp Ala Gln Lys Asn
660 665 670

Pro Gln Gln Ile Tyr Gln Gln Pro Ala Asp Val His Thr Gly Ala Tyr
675 680 685

Gly Asp Lys Gln Leu Ala Asp Glu Trp Ala Trp Ala Gly Ala Glu Leu
690 695 700

Tyr Leu Leu Thr Gly Glu Gln Ser Tyr Leu Gln Pro Leu Leu Ala Leu
705 710 715 720

Glu Thr Pro Ile Thr Ala Ala Ser Trp Ala Asn Val Ala Ala Leu Gly
725 730 735

Tyr Phe Ala Leu Ala Ser Ala Glu Gln Phe Glu Pro Ala Leu Arg Lys
740 745 750

Lys Val Gln Gln Lys Ile Gln Gln Ala Ala Ala Gln Ile Val Ala Glu
755 760 765

His Gln Ala Ser Ala Tyr Gln Val Ala Met Thr Gln Lys Asp Phe Val
770 775 780

Trp Gly Ser Asn Ala Val Ala Met Asn Lys Gly Met Leu Leu Tyr Gln
785 790 795 800

Ala Trp Lys Ile Asp Pro Gln Pro Glu Leu Arg Gln Ala Met Gln Gly
805 810 815

Leu Leu Asp Tyr Val Leu Gly Arg Asn Pro Leu Gln Leu Ser Tyr Val
820 825 830

Thr Gly Phe Gly Ala Gln Ser Pro Gln His Ile His His Arg Pro Ser
835 840 845

Ala Ala Asp Gln Ile Lys Ala Pro Val Pro Gly Trp Leu Val Gly Gly
850 855 860

Ala Gln Pro Gly Lys Gln Asp Lys Cys Ser Tyr Ser Gly Ile Phe Ala
865 870 875 880

Thr Gly Thr Leu Pro Ala Ala Ser Thr Leu Pro Ala Thr Thr Tyr Leu
885 890 895

Asp His Trp Cys Ser Tyr Ala Thr Asn Glu Val Ala Ile Asn Trp Asn
900 905 910

Ala Pro Leu Val Tyr Val Leu Ala Trp Ser Leu Ser Pro Asp Ser Met
915 920 925

Thr Lys
930

<210> 107 

<211> 1725 

<212> DNA 

<213> Bacteria 

<400> 107 

gtgtggaagc ccggattgtg gaatttcctt caaatggcag atgaagccgg attgacgagg 60

gatggaaaca ctccggttcc gacacccagt ccaaagccgg ctaacacacg tattgaagcg 120 

gaagattatg acggtattaa ttcttcaagt attgagataa taggtgttcc acctgaagga 180 

ggcagaggaa taggttatat taccagtggt gattatctgg tatacaagag tatagacttt 240 

ggaaacggag caacgtcgtt taaggccaag gttgcaaatg caaatacttc caatattgaa 300 

cttagattaa acggtccgaa tggtactctc ataggcacac tctcggtaaa atccacagga 360 

gattggaata catatgagga gcaaacttgc agcattagca aagtcaccgg aataaatgat 420 

ttgtacttgg tattcaaagg ccctgtaaac atagactggt tcacttttgg cgttgaaagc 480 

agttccacag gtctggggga tttaaatggt gacggaaata ttaactcgtc ggaccttcag 540 

gcgttaaaga ggcatttgct cggtatatca ccgcttacgg gagaggctct tttaagagcg 600 

gatgtaaata ggagcggcaa agtggattct actgactatt cagtgctgaa aagatatata 660 

ctccgcatta ttacagagtt ccccggacaa ggtgatgtac agacacccaa tccgtctgtt 720 

actccgacac aaactcctat ccccacgatt tcgggaaatg ctcttaggga ttatgcggag 780 

gcaaggggaa taaaaatcgg aacatgtgtc aactatccgt tttacaacaa ttcagatcca 840 

acctacaaca gcattttgca aagagaattt tcaatggttg tatgtgaaaa tgaaatgaag 900 

tttgatgctt tgcagccgag acaaaacgtt tttgattttt cgaaaggaga ccagttgctt 960 

gcttttgcag aaagaaacgg tatgcagatg aggggacata cgttgatttg gcacaatcaa 1020 

aacccgtcat ggcttacaaa cggtaactgg aaccgggatt cgctgcttgc ggtaatgaaa 1080 

aatcacatta ccactgttat gacccattac aaaggtaaaa ttgttgagtg ggatgtggca 1140 

aacgaatgta tggatgattc cggcaacggc ttaagaagca gcatatggag aaatgtaatc 1200 

ggtcaggact accttgacta tgctttcagg tatgcaagag aagcagatcc cgatgcactt 1260 

cttttctaca atgattataa tattgaagac ttgggtccaa agtccaatgc ggtatttaac 1320 

atgattaaaa gtatgaagga aagaggtgtg ccgattgacg gagtaggatt ccaatgccac 1380 

tttatcaatg gaatgagccc cgagtacctt gccagcattg atcaaaatat taagagatat 1440 

gcggaaatag gcgttatagt atcctttacc gaaatagata tacgcatacc tcagtcggaa 1500 

aacccggcaa ctgcattcca ggtacaggca aacaactata aggaacttat gaaaatttgt 1560 

ctggcaaacc ccaattgcaa tacctttgta atgtggggat tcacagataa atacacatgg 1620 

attccgggaa ctttcccagg atatggcaat ccattgattt atgacagcaa ttacaatccg 1680 

aaaccggcat acaatgcaat aaaggaagct cttatgggct attga 1725 

<210> 108 
<211> 574 
<212> PRT 
<213> Bacteria 

<400> 108 , 

Val Trp Lys Pro Gly Leu Trp Asn Phe Leu Gin Met Ala Asp Glu Ala 
1 5 10 15 
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Gly Leu 

Pro Ala 

Ser Ser 
50 

Gly Tyr 
65 

Gly Asn 

Ser Asn 

Thr Leu 

Thr cys 
130 
Phe Lys 
145 

Ser Ser 

ser Asp 

Thr Gly 

Asp Ser 
210 
Thr Glu 
225 

Thr Pro 

Asp Tyr 

Pro Phe 

Glu Phe 
290 
Gin Pro 
305 

Ala Phe 

Trp His 

Asp Ser 

His Tyr 
370 

ASp Asp 
385 

Gly Gin 

Pro Asp 

pro Lys 

Gly val 
450 
Met Ser 
465 

Ala Glu 

pro Gin 

Tyr Lys 

Phe val 
530 
Phe Pro 
545 

Lys Pro 



Thr Arg Asp 
20 

Asn Thr Arg 
35 

He Glu lie 

lie Thr Ser 

Gly Ala Thr 
85 

He Glu Leu 

100 
Ser Val Lys 
115 

ser lie ser 

Gly Pro Val 

Thr Gly Leu 
165 

Leu Gin Ala 

180 
Glu Ala Leu 
195 

Thr Asp Tyr 

Phe Pro Gly 

Thr Gin Thr 
245 

Ala Glu Ala 

260 
Tyr Asn Asn 
275 

ser Met val 

Arg Gin Asn 

Ala Glu Arg 
325 

Asn Gin Asn 

340 
Leu Leu Ala 
355 

Lys Gly Lys 

ser Gly Asn 

Asp Tyr Leu 
405 

Ala Leu Leu 

420 
ser Asn Ala 
435 

Pro lie Asp 



Gly Asn Thr Pro Val Pro Thr Pro Ser Pro Lys 
25 30 
Glu Asp Tyr Asp Gly lie Asn ser 
45 

Pro Pro Glu Gly Gly Arg Gly He 
60 

Leu Val Tyr Lys Ser 

Ala Lys val Ala Asn 
90 



He Glu Ala 
40 

lie Gly val 

55 

Gly Asp Tyr 
70 

Ser Phe Lys 



Arg Leu Asn 

ser Thr Gly 
120 

Lys val Thr 

135 
Asn lie Asp 
150 

Gly Asp Leu 

Leu Lys Arg 

Leu Arg Ala 
200 

Ser Val Leu 

215 
Gin Gly Asp 
230 

Pro lie Pro 

Arg Gly He 

Ser Asp Pro 
280 

val cys Glu 

295 
Val Phe Asp 
310 

Asn Gly Met 

Pro ser Trp 

Val Met Lys 
360 

lie val Glu 

375 
Gly Leu Arg 
390 

Asp Tyr Ala 
Phe Tyr Asn 



Val Phe Asn 
440 

Gly Val Gly 
455 

Pro Glu Tyr Leu Ala ser 
470 

lie val ser 



He Gly val 
485 

Ser Glu Asn 

500 
Glu Leu Met 
515 

Met Trp Gly 
Gly Tyr Gly 
Ala Tyr Asn 



Pro Ala Thr 

Lys lie cys 
520 

Phe Thr Asp 

535 
Asn Pro Leu 
550 

Ala lie Lys 



Gly Pro Asn Gly Thr 
105 

Asp Trp Asn Thr Tyr 
125 

Gly lie Asn Asp Leu 
140 

Trp Phe Thr Phe Gly 
155 

Asn Gly Asp Gly Asn 
170 

His Leu Leu Gly lie 
185 

Asp Val Asn Arg ser 
205 

Lys Arg Tyr lie Leu 
220 

Val Gin Thr Pro Asn 
235 

Thr lie Ser Gly Asn 
250 

Lys He Gly Thr Cys 
265 

Thr Tyr Asn ser lie 
285 

Asn Glu Met Lys Phe 
300 

Phe ser Lys Gly Asp 
315 

Gin Met Arg Gly His 
330 

Leu Thr Asn Gly Asn 
345 

Asn His lie Thr Thr 
365 

Trp Asp Val Ala Asn 
380 

Ser ser lie Trp Arg 
395 

Phe Arg Tyr Ala Arg 
410 

Asp Tyr Asn lie Glu 
425 

Met lie Lys Ser Met 
445 

Phe Gin Cys His Phe 
460 

lie Asp Gin Asn lie 
475 

Phe Thr Glu lie Asp 
490 

Ala Phe Gin Val Gin 
505 

Leu Ala Asn Pro Asn 
525 

Lys Tyr Thr Trp lie 
540 

lie Tyr Asp ser Asn 
555 

Glu Ala Leu Met Gly 
Ile Asp Phe
80 

Ala Asn Thr 
95 

Leu lie Gly 
110 

Glu Glu Gin 



Tyr Leu Val 

val Glu ser 
160 

lie Asn ser 

175 
Ser Pro Leu 
190 

Gly Lys val 

Arg lie lie 

Pro Ser Val 
240 

Ala Leu Arg 

255 
Val Asn Tyr 
270 

Leu Gin Arg 
Asp Ala Leu 

Gin Leu Leu 

320 

Thr Leu lie 

335 
Trp Asn Arg 
350 

val Met Thr 

Glu Cys Met 

Asn val lie 
400 

Glu Ala Asp 

415 
Asp Leu Gly 
430 

Lys Glu Arg 

lie Asn Gly 

Lys Arg Tyr 
480 

lie Arg lie 

495 
Ala Asn Asn 
510 

cys Asn Thr 

Pro Gly Thr 

Tyr Asn Pro 
560 

Tyr 
565 570

<210> 109
<211> 1242
<212> DNA
<213> unknown

<220>

<223> obtained from an environmental sample

<400> 109
atgctaaaag ttttacgtaa acctattatt tctggattag ctttagctct attattgccg 60
gcaggggcag ctggtgccga aactaatatt tcaaagaagc caaatataag tggattaacc 120
gcgccgcaat tagaccaaag atataaagat tctttcacca ttggtgctgc ggttgagccg 180
tatcaattat tagatgcaaa agattcacaa atgctaaagc ggcattttaa tagtatcgta 240
gcagagaatg tcatgaagcc tagtagttta cagccagtag aaggacaatt caattgggag 300
ccggccgata aacttgttca gtttgcgaag gaaaatggaa tggacatgcg cggacatacg 360
cttgtctggc atagccaggt accggattgg ttctttgaag atgcggcagg aaatccaatg 420
gttgtttggg aaaatggcag gcaagtggtt gccgatccag caaatcttca ggaaaacaaa 480
gagctcttac ttagccgatt acaaaatcat attcaggcag tcgtaacgcg ttataaagat 540
gatataaaat cttgggatgt tgttaatgaa gtaatcgatg aatggggcgg acattctgaa 600
gggctgcgtc aatctccatg gttcctcatc accggaacgg actatattaa agttgctttt 660
gaaactgcaa gagaatatgc agctccagac gctaagctgt atatcaatga ttacaataca 720
gaagtagaac caaaaaggac gcacctttat aacttagtaa aaagtttaaa agaagaacaa 780
aacgttccaa ttgatggtgt tgggcatcag tctcacattc aaattggctg gccttcagaa 840
aaagaaattg aagataccat taatatgttt gcagatcttg gtttagataa ccaaatcacc 900
gagcttgatg ttagtatgta tggctggcca gtaaggtcgt atccaactta tgatgcgatc 960
ccagaactta aattcatgga tcaagcagct cgttatgatc gtttatttaa gttatatgag 1020
aaattaggag ataaaatcag taatgtgaca ttctggggta ttgcggataa ccatacatgg 1080
ctgaatgacc gtgcagatgt ttactatgat gaaaatggaa atgttgtatt agatagagaa 1140
acaccaagag tagaaagagg agcaggaaaa gatgcgccat ttgtatttga tcctgaatac 1200
aatgtaaaac cagcttattg ggcaattatc gaccacaaat aa 1242

<210> 110 
<211> 413 
<212> PRT 
<213> unknown 

<220>

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(26)

<400> 110

Met Leu Lys Val Leu Arg Lys Pro Ile Ile Ser Gly Leu Ala Leu Ala
1 5 10 15

Leu Leu Leu Pro Ala Gly Ala Ala Gly Ala Glu Thr Asn Ile Ser Lys
20 25 30

Lys Pro Asn Ile Ser Gly Leu Thr Ala Pro Gln Leu Asp Gln Arg Tyr
35 40 45

Lys Asp Ser Phe Thr Ile Gly Ala Ala Val Glu Pro Tyr Gln Leu Leu
50 55 60

Asp Ala Lys Asp Ser Gln Met Leu Lys Arg His Phe Asn Ser Ile Val
65 70 75 80

Ala Glu Asn Val Met Lys Pro Ser Ser Leu Gln Pro Val Glu Gly Gln
85 90 95

Phe Asn Trp Glu Pro Ala Asp Lys Leu Val Gln Phe Ala Lys Glu Asn
100 105 110

Gly Met Asp Met Arg Gly His Thr Leu Val Trp His Ser Gln Val Pro
115 120 125

Asp Trp Phe Phe Glu Asp Ala Ala Gly Asn Pro Met Val Val Trp Glu
130 135 140

Asn Gly Arg Gln Val Val Ala Asp Pro Ala Asn Leu Gln Glu Asn Lys
145 150 155 160

Glu Leu Leu Leu Ser Arg Leu Gln Asn His Ile Gln Ala Val Val Thr
165 170 175

Arg Tyr Lys Asp Asp Ile Lys Ser Trp Asp Val Val Asn Glu Val Ile
180 185 190

Asp Glu Trp Gly Gly His Ser Glu Gly Leu Arg Gln Ser Pro Trp Phe
195 200 205

Leu Ile Thr Gly Thr Asp Tyr Ile Lys Val Ala Phe Glu Thr Ala Arg
210 215 220

Glu Tyr Ala Ala Pro Asp Ala Lys Leu Tyr Ile Asn Asp Tyr Asn Thr
225 230 235 240

Glu Val Glu Pro Lys Arg Thr His Leu Tyr Asn Leu Val Lys Ser Leu
245 250 255

Lys Glu Glu Gln Asn Val Pro Ile Asp Gly Val Gly His Gln Ser His
260 265 270

Ile Gln Ile Gly Trp Pro Ser Glu Lys Glu Ile Glu Asp Thr Ile Asn
275 280 285

Met Phe Ala Asp Leu Gly Leu Asp Asn Gln Ile Thr Glu Leu Asp Val
290 295 300

Ser Met Tyr Gly Trp Pro Val Arg Ser Tyr Pro Thr Tyr Asp Ala Ile
305 310 315 320

Pro Glu Leu Lys Phe Met Asp Gln Ala Ala Arg Tyr Asp Arg Leu Phe
325 330 335

Lys Leu Tyr Glu Lys Leu Gly Asp Lys Ile Ser Asn Val Thr Phe Trp
340 345 350

Gly Ile Ala Asp Asn His Thr Trp Leu Asn Asp Arg Ala Asp Val Tyr
355 360 365

Tyr Asp Glu Asn Gly Asn Val Val Leu Asp Arg Glu Thr Pro Arg Val
370 375 380

Glu Arg Gly Ala Gly Lys Asp Ala Pro Phe Val Phe Asp Pro Glu Tyr
385 390 395 400

Asn Val Lys Pro Ala Tyr Trp Ala Ile Ile Asp His Lys
405 410

<210> 111 
<211> 1089 
<212> DNA 
<213> Unknown 

<220>

<223> Obtained from an environmental sample

<400> 111

atgttgacga ccccgacaac tcaagatcat gtccccgtgc ttaaggacgc tttcaaaggc 60 

aagttcctca ttggagccgt gctgggttat gacgcactcc agggaaagga tccggcgagt 120 

gtggaaattg cgaccacgca cttcgatgct ctcactgcgg aaaacagcat gaagcccgct 180 

ctggtgcaac ctaaagaggg cgaatttgac ttcgctgatg gagaccggct tcttgacatc 240 

acacagcagt gcggtgcgac tgcgattggc cacactttgc tctggcacca acagacaccg 300 

aaatggtttt tcgaggggcc agatgaccag cctactaacc gcgagttggc cctggcacgc 360 

atgagaaagc acatcgccac tcttgttggc cgttacaaag gtcgcattaa gcaatgggat 420 

gtggtgaatg aggcgattag cgatgcagag ggcgagtact tgagaccaaa tagtccatgg 480 

ttcaaggctg ttggagaaga tcacattgcg caggctttcc gggcagcgca cgaagccgat 540 

cctgacgcca tcctcatcta taacgattac aacatcgagc aggagtacaa gcgtcccaaa 600 

gcgatacgac tgctgaggtc attacttgag caggacgttc cccttcatgc cgtgggcatc 660 

cagggccact ggcgtatgga cactctgaat gttgccgaaa tcgaagaagc tatcaaagaa 720 

tttgctgcgc tgggtctcaa ggtcatgatc accgagcttg acatcagcgt gctaccgaca 780 

aagtatcagg gagccgatct ctctacccgc gaagaattga cgcctgaaat caatccctat 840 

acggagggac tacccgagaa cgttgcccgg caacatgccg aatgttaccg ccaagtcttc 900 

aaaatgttcc tgtgccacaa ggatgccatt ggccgtgtca cgctctgggg cgttcatgat 960 

ggcagatcat ggttcaatga ctttcccgtc agagggcgca ccgattatcc tctgcttttc 1020 

gaccggcagg gcaaacccaa gccagcattt tttgccgtct tgaaggctgc gcaagatcag 1080 
ccacaatga 1089

<210> 112 
<211> 362 
<212> PRT 
<213> unknown 

<220> 

<223> obtained from an environmental sample 
<400> 112 

Met Leu Thr Thr Pro Thr Thr Gln Asp His Val Pro Val Leu Lys Asp
1 5 10 15

Ala Phe Lys Gly Lys Phe Leu Ile Gly Ala Val Leu Gly Tyr Asp Ala
20 25 30

Leu Gln Gly Lys Asp Pro Ala Ser Val Glu Ile Ala Thr Thr His Phe
35 40 45

Asp Ala Leu Thr Ala Glu Asn Ser Met Lys Pro Ala Leu Val Gln Pro
50 55 60

Lys Glu Gly Glu Phe Asp Phe Ala Asp Gly Asp Arg Leu Leu Asp Ile
65 70 75 80

Thr Gln Gln Cys Gly Ala Thr Ala Ile Gly His Thr Leu Leu Trp His
85 90 95

Gln Gln Thr Pro Lys Trp Phe Phe Glu Gly Pro Asp Asp Gln Pro Thr
100 105 110

Asn Arg Glu Leu Ala Leu Ala Arg Met Arg Lys His Ile Ala Thr Leu
115 120 125

Val Gly Arg Tyr Lys Gly Arg Ile Lys Gln Trp Asp Val Val Asn Glu
130 135 140

Ala Ile Ser Asp Ala Glu Gly Glu Tyr Leu Arg Pro Asn Ser Pro Trp
145 150 155 160

Phe Lys Ala Val Gly Glu Asp His Ile Ala Gln Ala Phe Arg Ala Ala
165 170 175

His Glu Ala Asp Pro Asp Ala Ile Leu Ile Tyr Asn Asp Tyr Asn Ile
180 185 190

Glu Gln Glu Tyr Lys Arg Pro Lys Ala Ile Arg Leu Leu Arg Ser Leu
195 200 205

Leu Glu Gln Asp Val Pro Leu His Ala Val Gly Ile Gln Gly His Trp
210 215 220

Arg Met Asp Thr Leu Asn Val Ala Glu Ile Glu Glu Ala Ile Lys Glu
225 230 235 240

Phe Ala Ala Leu Gly Leu Lys Val Met Ile Thr Glu Leu Asp Ile Ser
245 250 255

Val Leu Pro Thr Lys Tyr Gln Gly Ala Asp Leu Ser Thr Arg Glu Glu
260 265 270

Leu Thr Pro Glu Ile Asn Pro Tyr Thr Glu Gly Leu Pro Glu Asn Val
275 280 285

Ala Arg Gln His Ala Glu Cys Tyr Arg Gln Val Phe Lys Met Phe Leu
290 295 300

Cys His Lys Asp Ala Ile Gly Arg Val Thr Leu Trp Gly Val His Asp
305 310 315 320

Gly Arg Ser Trp Phe Asn Asp Phe Pro Val Arg Gly Arg Thr Asp Tyr
325 330 335

Pro Leu Leu Phe Asp Arg Gln Gly Lys Pro Lys Pro Ala Phe Phe Ala
340 345 350

Val Leu Lys Ala Ala Gln Asp Gln Pro Gln
355 360

<210> 113 
<211> 1155 
<212> DNA 
<213> Unknown 

<220>

<223> obtained from an environmental sample 
<400> 113 

atgttaaaag tattgcgtaa accacttttt tctggattag ctttagcgat agtattacct 60 

accggattat ccagtgctta tgcagctgaa aatcaaccag ttagtgcatt agatgcagcg 120 

gttgaacttg atgaaagata tgcagaatca ttcgatattg gtgcagccgt tgagccttct 180 

atgcttcaag gaaaagatgc tgaagtatta aagcgtcatt ataacagcat tgtggccgaa 240 

aatgtaatga aaccgattaa tatacagcct gaagaaggaa agttcacttt taaagaaatg 300 

gataaaatcg ttaagtttgc gaaagaaaat aatatgaagc ttcgtggcca tacccttatt 360 

tggcacagtc aagtaccgga gtggttcttc cttgataaag aaggaaataa gatggtggat 420 

gaaacggatc caaagcagcg cgaaaaaaat aaaaggcttt tacttaagcg tttagaaacg 480 

catattaaaa cgatcgtcaa gcgctataaa aatgatatta gctcctggga cgtggtcaac 540 

gaggtagtgg atgataacgg gaaattacgt aattcaccct ggtatcaaat cacaggtact 600 

gattatatca aggttgcttt tgaaacagcg gaccgttatg cagggaagaa cgctaagctt 660 

tatatcaatg actacaacac ggaaatagac cctaaaagag aaaccctcta taatcttgtc 720 

aaggaattag tgaaggaggg agtcccagtt gatggagtgg gacatcaagc tcatatccaa 780 

atcggctggc caactatagc ggaaatcgag aaaaccatta atatgtttgc agaccttggc 840 

ctagacaatc aaattacaga actagatgtt agcctttatg ggtggccgcc aaagcctgct 900 
tacccaactt atgacgaaat cccggcaagt gaattcgaac gtcaagctgt tcgttacgat 960

caactatttg atttatacga gagattggga gataaaatta gcagtgtgac attctggggc 1020 

gttgctgaca accatacatg gttaaatgac cgtgcagaac aatataatga cggggtaggc 1080 

gtggacgcac catttgtttt cgataaggat tataatgtaa aaccagctta ttgggctatt 1140 

atcgatcgcg attaa 1155 

<210> 114 
<211> 384 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 



<221> SIGNAL
<222> (1)...(28)

<400> 114

Met Leu Lys Val Leu Arg Lys Pro Leu Phe Ser Gly Leu Ala Leu Ala
1 5 10 15

Ile Val Leu Pro Thr Gly Leu Ser Ser Ala Tyr Ala Ala Glu Asn Gln
20 25 30

Pro Val Ser Ala Leu Asp Ala Ala Val Glu Leu Asp Glu Arg Tyr Ala
35 40 45

Glu Ser Phe Asp Ile Gly Ala Ala Val Glu Pro Ser Met Leu Gln Gly
50 55 60

Lys Asp Ala Glu Val Leu Lys Arg His Tyr Asn Ser Ile Val Ala Glu
65 70 75 80

Asn Val Met Lys Pro Ile Asn Ile Gln Pro Glu Glu Gly Lys Phe Thr
85 90 95

Phe Lys Glu Met Asp Lys Ile Val Lys Phe Ala Lys Glu Asn Asn Met
100 105 110

Lys Leu Arg Gly His Thr Leu Ile Trp His Ser Gln Val Pro Glu Trp
115 120 125

Phe Phe Leu Asp Lys Glu Gly Asn Lys Met Val Asp Glu Thr Asp Pro
130 135 140

Lys Gln Arg Glu Lys Asn Lys Arg Leu Leu Leu Lys Arg Leu Glu Thr
145 150 155 160

His Ile Lys Thr Ile Val Lys Arg Tyr Lys Asn Asp Ile Ser Ser Trp
165 170 175

Asp Val Val Asn Glu Val Val Asp Asp Asn Gly Lys Leu Arg Asn Ser
180 185 190

Pro Trp Tyr Gln Ile Thr Gly Thr Asp Tyr Ile Lys Val Ala Phe Glu
195 200 205

Thr Ala Asp Arg Tyr Ala Gly Lys Asn Ala Lys Leu Tyr Ile Asn Asp
210 215 220

Tyr Asn Thr Glu Ile Asp Pro Lys Arg Glu Thr Leu Tyr Asn Leu Val
225 230 235 240

Lys Glu Leu Val Lys Glu Gly Val Pro Val Asp Gly Val Gly His Gln
245 250 255

Ala His Ile Gln Ile Gly Trp Pro Thr Ile Ala Glu Ile Glu Lys Thr
260 265 270

Ile Asn Met Phe Ala Asp Leu Gly Leu Asp Asn Gln Ile Thr Glu Leu
275 280 285

Asp Val Ser Leu Tyr Gly Trp Pro Pro Lys Pro Ala Tyr Pro Thr Tyr
290 295 300

Asp Glu Ile Pro Ala Ser Glu Phe Glu Arg Gln Ala Val Arg Tyr Asp
305 310 315 320

Gln Leu Phe Asp Leu Tyr Glu Arg Leu Gly Asp Lys Ile Ser Ser Val
325 330 335

Thr Phe Trp Gly Val Ala Asp Asn His Thr Trp Leu Asn Asp Arg Ala
340 345 350

Glu Gln Tyr Asn Asp Gly Val Gly Val Asp Ala Pro Phe Val Phe Asp
355 360 365

Lys Asp Tyr Asn Val Lys Pro Ala Tyr Trp Ala Ile Ile Asp Arg Asp
370 375 380

<210> 115 
<211> 1362 
<212> DNA
<213> Unknown

<220>

<223> Obtained from an environmental sample

<400> 115

atgacgaacc gtaaatcgaa cgtgcaccgt tcattgaccg atgatttgct cgatggtgtc 60



ttcgccgagg caaaagcggg caaagttgag aagtaccgtg ccaccgggat ccttggaacg 120 

ctattcggat tcactgtggc gtcctccatc atgttggcgg cttgcagcaa cgcacaagag 180 

aatgttccac cagttgcttc atccaccgca cagagcaata tcacccagga gaacgttccg 240 

ccgctcaaag atgcgtttaa gggcaagttc ttgattggca ccgcggtgag caatcgcttg 300 

ctggagggac aagatccggc cacggaagcc ttggtgcgca ggcacttcga tgctctcacg 360 

gcggaaaacg ccatgaagcc ggatgcactg caaccgcgcg aaggccagtt caacttcgtc 420 

gccgccgacc gtctggtgga aatcgcccag caaagcggcg cgacagtggt cggccacacg 480

rtggtctggc actcccaaac gccaggctgg ttcttccagg gtccgaatgg ccagccagcg 540 

agtcgagaac tggccctggc gcggatgcga acacacatca agacggtggt gggacgctac 600 

aaagggcgca tcaagcagtg ggatgtggtc aacgaagcga tcaacgacgg ccctggcgtg 660 

ctgcggcaaa gtccgtggct gcgtgccatc ggcgaagact acatcgccga agcgttccgc 720 

gccgcgcacg aagccgatcc tgacgccatt ctggtctaca acgactacaa catcgaactc 780

aactacaagc gtcccaaggc gctggaactg ctaaagaagc tcatcgacca gaaggttccg 840 

attcatggtg tgggcattca ggctcactgg cgcatgaccc cgccgctggc cgagaccgaa 900 

gaagccatca aacagttcgc cgcgctgggc ctgaaggtga tgttcaccga actggacatc 960 

ggtgtgctgc ccactcagta tcagggggct gacatctcgg cgcgtgaaac catgacaccc 1020 

gaacagcaag cggtgatgaa cccttacact cagggcttgc cggctgaagt ggcacagcaa 1080

catgccgagc gctaccgaca ggccttcgag ctgttcctgc gccacaagga tgtgattggt 1140 

cgcgtcacgc tctggggcac gcatgatggc gaatcctggc tgaacggttt tccggtgcgg 1200 

ggccgcaccg actatccctt gctcttcgac cgccggtatc agccaaaacc agccttcttc 1260 

gccgtcaggc aggttgcaca ggcgcatact gtacaaacga ccggtgcgca aacccaagct 1320 

acagcgaaga caattcaaaa agcttctcga gagtacttct ag 1362

<210> 116 
<211> 453 
<212> PRT 
<213> unknown 

<220> 

<223> obtained from an environmental sample 
<400> 116 

Met Thr Asn Arg Lys Ser Asn Val His Arg Ser Leu Thr Asp Asp Leu
1 5 10 15

Leu Asp Gly Val Phe Ala Glu Ala Lys Ala Gly Lys Val Glu Lys Tyr
20 25 30

Arg Ala Thr Gly Ile Leu Gly Thr Leu Phe Gly Phe Thr Val Ala Ser
35 40 45

Ser Ile Met Leu Ala Ala Cys Ser Asn Ala Gln Glu Asn Val Pro Pro
50 55 60

Val Ala Ser Ser Thr Ala Gln Ser Asn Ile Thr Gln Glu Asn Val Pro
65 70 75 80

Pro Leu Lys Asp Ala Phe Lys Gly Lys Phe Leu Ile Gly Thr Ala Val
85 90 95

Ser Asn Arg Leu Leu Glu Gly Gln Asp Pro Ala Thr Glu Ala Leu Val
100 105 110

Arg Arg His Phe Asp Ala Leu Thr Ala Glu Asn Ala Met Lys Pro Asp
115 120 125

Ala Leu Gln Pro Arg Glu Gly Gln Phe Asn Phe Val Ala Ala Asp Arg
130 135 140

Leu Val Glu Ile Ala Gln Gln Ser Gly Ala Thr Val Val Gly His Thr
145 150 155 160

Leu Val Trp His Ser Gln Thr Pro Gly Trp Phe Phe Gln Gly Pro Asn
165 170 175

Gly Gln Pro Ala Ser Arg Glu Leu Ala Leu Ala Arg Met Arg Thr His
180 185 190

Ile Lys Thr Val Val Gly Arg Tyr Lys Gly Arg Ile Lys Gln Trp Asp
195 200 205

Val Val Asn Glu Ala Ile Asn Asp Gly Pro Gly Val Leu Arg Gln Ser
210 215 220

Pro Trp Leu Arg Ala Ile Gly Glu Asp Tyr Ile Ala Glu Ala Phe Arg
225 230 235 240

Ala Ala His Glu Ala Asp Pro Asp Ala Ile Leu Val Tyr Asn Asp Tyr
245 250 255

Asn Ile Glu Leu Asn Tyr Lys Arg Pro Lys Ala Leu Glu Leu Leu Lys
260 265 270

Lys Leu Ile Asp Gln Lys Val Pro Ile His Gly Val Gly Ile Gln Ala
275 280 285

His Trp Arg Met Thr Pro Pro Leu Ala Glu Thr Glu Glu Ala Ile Lys
290 295 300

Gln Phe Ala Ala Leu Gly Leu Lys Val Met Phe Thr Glu Leu Asp Ile
305 310 315 320

Gly Val Leu Pro Thr Gln Tyr Gln Gly Ala Asp Ile Ser Ala Arg Glu
325 330 335

Thr Met Thr Pro Glu Gln Gln Ala Val Met Asn Pro Tyr Thr Gln Gly
340 345 350

Leu Pro Ala Glu Val Ala Gln Gln His Ala Glu Arg Tyr Arg Gln Ala
355 360 365

Phe Glu Leu Phe Leu Arg His Lys Asp Val Ile Gly Arg Val Thr Leu
370 375 380

Trp Gly Thr His Asp Gly Glu Ser Trp Leu Asn Gly Phe Pro Val Arg
385 390 395 400

Gly Arg Thr Asp Tyr Pro Leu Leu Phe Asp Arg Arg Tyr Gln Pro Lys
405 410 415

Pro Ala Phe Phe Ala Val Arg Gln Val Ala Gln Ala His Thr Val Gln
420 425 430

Thr Thr Gly Ala Gln Thr Gln Ala Thr Ala Lys Thr Ile Gln Lys Ala
435 440 445

Ser Arg Glu Tyr Phe
450

<210> 117 

<211> 1437 

<212> DNA 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 117 

atgacgaacc gtaaattgaa cgtgcaccgt tcattgagcg atgatttgct cgatggcgcc 60 

ttcgccgagt caaaagcggg caaagttgag aaataccgtg ccacggggat ccttggaacg 120 

ctattcggat tcactgtggc gtcctccatc atgttggcgg cttgcagcaa cgcacaagag 180 

aatgctccac cagttgcttc atccaccgca caaagcaata tcacccagga gaacgttccg 240 

ccgctcaagg atgcgtttaa gggcaagttc ttgattggca ccatcgcgag caatcgcttg 300 

ctgcagggac aagatccagc cacagaagcc ctggtgcgca ggcacttcga cgccctcacg 360 

gcggaaaatg ccatgaagcc tgatgccatg caacccagag agggtgagtt caactttgcc 420 

gccgctgacc gcctggtgga aatcgcccag caaagcggcg ccacggtggt cggccacacc 480 

ttggtctggc atagccaaac gccaagctgg ttcttccagg gtccagatgg ccaaccggcg 540 

agtcgggaac tggccttggc acggatgcga acgcacatca agactgtggt gggacgctac 600 

aaaggacgca tcaagcaatg ggatgtggtc aacgaagcga tcaacgacgg ccctggagtg 660 

ctgcggccat cgccgtggtt gcgcgccatc ggcgaagact tcatcgccga agcgttccgc 720 

gccgcgcacg aagctgatcc cgacgcgatt ctcgtctaca acgactacaa catcgagctc 780 

aactacaagc gtcccaaggc gctggaacta ctgaagagac tcatcgagca gaaggttccg 840 

attcatggtg tgggcattca ggctcactgg cgcatgaccc cgccgctggc cgagatggaa 900 

gagaccatca agcagttttc ggctttgggc ttgaaggtaa tgatcaccga gttggacatt 960 

ggtgtattgc caacacaata ccagggtgcc gacatctcgg ctcgcgagac catgacaccc 1020 

gaacagcaag cggtgatgaa cccttacacg cagggcttgc cggctgaagt ggcgcagcaa 1080 

catgccgagc gttatcgtca ggcgtttgag ctgttcatgc gttacaagga tgtgattggt 1140 

cgcgttaccc tgtggggcac gcatgatggc gaatcttggc tgaacggttt tcccgttcgt 1200 

ggccgcacgg attatcctct actgttcgac cgccggtatc agcctaagcc cgccttcttc 1260 

gcggtgcaaa aggtcgcgca ggcgcagaac gcacaggcag caaccgatca agcaccactt 1320 

gcacaaaacc cagttgcgca gaagaaatct gcaccaaggc aggcggctca aaatcagacc 1380 

actcaaaagc cagtggtaca aaagcaaagt gcggcaagtc gggccgcaga aaagtaa 1437 

<210> 118 
<211> 478 
<212> PRT 
<213> Unknown 
<220>

<223> obtained from an environmental sample 
<400> 118 

Met Thr Asn Arg Lys Leu Asn Val His Arg Ser Leu Ser Asp Asp Leu
1 5 10 15

Leu Asp Gly Ala Phe Ala Glu Ser Lys Ala Gly Lys Val Glu Lys Tyr
20 25 30

Arg Ala Thr Gly Ile Leu Gly Thr Leu Phe Gly Phe Thr Val Ala Ser
35 40 45

Ser Ile Met Leu Ala Ala Cys Ser Asn Ala Gln Glu Asn Ala Pro Pro
50 55 60

Val Ala Ser Ser Thr Ala Gln Ser Asn Ile Thr Gln Glu Asn Val Pro
65 70 75 80

Pro Leu Lys Asp Ala Phe Lys Gly Lys Phe Leu Ile Gly Thr Ile Ala
85 90 95

Ser Asn Arg Leu Leu Gln Gly Gln Asp Pro Ala Thr Glu Ala Leu Val
100 105 110

Arg Arg His Phe Asp Ala Leu Thr Ala Glu Asn Ala Met Lys Pro Asp
115 120 125

Ala Met Gln Pro Arg Glu Gly Glu Phe Asn Phe Ala Ala Ala Asp Arg
130 135 140

Leu Val Glu Ile Ala Gln Gln Ser Gly Ala Thr Val Val Gly His Thr
145 150 155 160

Leu Val Trp His Ser Gln Thr Pro Ser Trp Phe Phe Gln Gly Pro Asp
165 170 175

Gly Gln Pro Ala Ser Arg Glu Leu Ala Leu Ala Arg Met Arg Thr His
180 185 190

Ile Lys Thr Val Val Gly Arg Tyr Lys Gly Arg Ile Lys Gln Trp Asp
195 200 205

Val Val Asn Glu Ala Ile Asn Asp Gly Pro Gly Val Leu Arg Pro Ser
210 215 220

Pro Trp Leu Arg Ala Ile Gly Glu Asp Phe Ile Ala Glu Ala Phe Arg
225 230 235 240

Ala Ala His Glu Ala Asp Pro Asp Ala Ile Leu Val Tyr Asn Asp Tyr
245 250 255

Asn Ile Glu Leu Asn Tyr Lys Arg Pro Lys Ala Leu Glu Leu Leu Lys
260 265 270

Arg Leu Ile Glu Gln Lys Val Pro Ile His Gly Val Gly Ile Gln Ala
275 280 285

His Trp Arg Met Thr Pro Pro Leu Ala Glu Met Glu Glu Thr Ile Lys
290 295 300

Gln Phe Ser Ala Leu Gly Leu Lys Val Met Ile Thr Glu Leu Asp Ile
305 310 315 320

Gly Val Leu Pro Thr Gln Tyr Gln Gly Ala Asp Ile Ser Ala Arg Glu
325 330 335

Thr Met Thr Pro Glu Gln Gln Ala Val Met Asn Pro Tyr Thr Gln Gly
340 345 350

Leu Pro Ala Glu Val Ala Gln Gln His Ala Glu Arg Tyr Arg Gln Ala
355 360 365

Phe Glu Leu Phe Met Arg Tyr Lys Asp Val Ile Gly Arg Val Thr Leu
370 375 380

Trp Gly Thr His Asp Gly Glu Ser Trp Leu Asn Gly Phe Pro Val Arg
385 390 395 400

Gly Arg Thr Asp Tyr Pro Leu Leu Phe Asp Arg Arg Tyr Gln Pro Lys
405 410 415

Pro Ala Phe Phe Ala Val Gln Lys Val Ala Gln Ala Gln Asn Ala Gln
420 425 430

Ala Ala Thr Asp Gln Ala Pro Leu Ala Gln Asn Pro Val Ala Gln Lys
435 440 445

Lys Ser Ala Pro Arg Gln Ala Ala Gln Asn Gln Thr Thr Gln Lys Pro
450 455 460

Val Val Gln Lys Gln Ser Ala Ala Ser Arg Ala Ala Glu Lys
465 470 475

<210> 119 
<211> 2559 
<212> DNA 
<213> unknown 
<220>

<223> Obtained from an environmental sample 



<400> 119 

atgaaaaaaa gattgttagc gttgatagtg acattagttt ttattatctc attgtttaat 60
cccatattca ccacaccttt aacaaatgta gcaaaggctc aaagtaacca aacaaattta 120
aaatttgact ttgaaaacgg tactcaaggt tggggagcaa gaggtgtttc aacaactatt 180
gcaaccgttt acgagcaagc ttatgaagga agttattctt taaaggtttc aggtagaagt 240
tcaacgtggg atggagcagt tgtggatatc acatcaagta tttcagcaaa tgtcacctat 300
acagtttctt tatttgttcg tcacagcgat gtaaaaccac aaagattttc tgtctatgta 360
tatgtcaaag ataacacagg cgaaaaatac atccaggttg cagacaaagt ggttatgcca 420
aacttttgga agcagctctt tgggaagttc acaatcacaa catcaaatcc aattcaaaaa 480
gtagaacttc ttgtatgtgt tccatctaac aaatctttag gattttatct tgacaatgta 540
gttattactt cagcacaacc agcttcctcg ggtgttgtta aatcttgcac atttgaaagc 600
ggtagcactg agggttttgt tcagagaggt tcagcttcat tgacagttgt cgacggtgta 660
tactatcatt ctccaacaaa agcattatat gtgacaggaa ggacagctac atggcagggt 720
gcacagatag atatgacaag tttgcttgag aagggcaagg attatcagtt tagcatatgg 780
gtatatcaaa atagtggaag tgatcagaag ataaccctta cgatgcaaag gaagaatgaa 840
gatggaacta cgagttatga ttctataaag tatcagcaaa cagttccatc tggtacatgg 900
acagaagtaa caggttcata cacagtgcct cagacagcaa cacagcttat attctatgtt 960
gaatcaccga atattaattt tgacttctac cttgatgact ttacagcggt tgacaaaaac 1020
ccacctgttg taaacccagg gcttgttaaa tcttgcacat ttgaaagcgg tagcactgag 1080
ggttttgttc agagaggttc agcttcattg acagttgtcg acggtgtata ctatcattct 1140
ccaacaaaag cattgtatgt gacaggaagg acagctacat ggcagggtgc acagatagat 1200
atgacaagtt tgcttgagaa gggcaaggat tatcagttta gcatatgggt atatcaaaat 1260
agtggaagtg atcagaagat aacccttacg atgcaaagga agaatgaaga tggaactacg 1320
agttatgatt ctataaagta tcagcaaaca gttccatctg gtacatggac agaagtaaca 1380
ggttcataca cagtgcctca gacagcaaca cagcttatat tctatgttga atcaccgaat 1440
attaattttg acttctacct tgatgacttt acagtaatag ataaaaatcc agtgacggta 1500
ccgattgcag caaaagaacc cgaatgggaa attccgtcac tttgtcagca atatagtcaa 1560
tatttctcaa taggtgttgc aataccgtat aaagtacttc aaaatcctgt tgaaagagca 1620
atggtgttaa aacacttcaa cagtataaca gctgaaaatg agatgaaacc tgacgctctg 1680
caaagaacag aagggaactt tacattcgat atagcagacc agtatgtaaa cttcgcacag 1740
caaaacggta ttggaattag agggcatact ctggtatggc acagccaagt acctaattgg 1800
ttcttccagc acagtgatgg aacttcactt gatccaagca atccagatga taagcaactt 1860
ttgagagata gattgaaaaa tcatattcaa actgttatgt caagatacaa agggaaagtc 1920
tatgcatggg atgttgtaaa cgaggcaata gatgaaagcc agcctgatgg atttagaaga 1980
agcgaatggt acagaatact tggtccaaca cctgagacaa atggtattcc agaatacatt 2040
gtgcttgctt tcaggtatgc aagagaggcg gatccggatg caaaactttt ctacaatgac 2100
tacaacacag agatatctaa aaaaagacag tttatatatg acatggtaaa aaagctacat 2160
gatatgggtt taattgatgg tgttggtttg caagggcata taaatgttga ttctccaaca 2220
gtaaaagata tagaagatac aatcaatctt ttctcaacaa ttcctggact tgagatacag 2280
gtaacagagc ttgacataag cgtttacaca agcagcagtc agcgttatga tacgcttcct 2340
caggatataa tgataaaaca agcaatgaag tttaaagaac tatttgaaat gttaaagaga 2400
catagtgata gagtcacaaa tgtgacactt tggggactta aggatgatta ttcatggctt 2460
tcaaaggata gaaataactg gccattgctt tttgacagca actaccaggc aaaatacagc 2520
tactgggcaa ttcaaaaagc ttctcgagag tacttctag 2559

<210> 120
<211> 852
<212> PRT
<213> Unknown

<220>

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(33)

<400> 120

Met Lys Lys Arg Leu Leu Ala Leu Ile Val Thr Leu Val Phe Ile Ile
1 5 10 15

Ser Leu Phe Asn Pro Ile Phe Thr Thr Pro Leu Thr Asn Val Ala Lys
20 25 30

Ala Gln Ser Asn Gln Thr Asn Leu Lys Phe Asp Phe Glu Asn Gly Thr
35 40 45

Gln Gly Trp Gly Ala Arg Gly Val Ser Thr Thr Ile Ala Thr Val Tyr
50 55 60

Glu Gln Ala Tyr Glu Gly Ser Tyr Ser Leu Lys Val Ser Gly Arg Ser
65 70 75 80

Ser Thr Trp Asp Gly Ala Val Val Asp Ile Thr Ser Ser Ile Ser Ala
85 90 95

Asn Val Thr Tyr Thr Val Ser Leu Phe Val Arg His Ser Asp Val Lys
100 105 110

Pro Gln Arg Phe Ser Val Tyr Val Tyr Val Lys Asp Asn Thr Gly Glu
115 120 125

Lys Tyr Ile Gln Val Ala Asp Lys Val Val Met Pro Asn Phe Trp Lys
130 135 140

Gln Leu Phe Gly Lys Phe Thr Ile Thr Thr Ser Asn Pro Ile Gln Lys
145 150 155 160

Val Glu Leu Leu Val Cys Val Pro Ser Asn Lys Ser Leu Gly Phe Tyr
165 170 175

Leu Asp Asn Val Val Ile Thr Ser Ala Gln Pro Ala Ser Ser Gly Val
180 185 190

Val Lys Ser Cys Thr Phe Glu Ser Gly Ser Thr Glu Gly Phe Val Gln
195 200 205

Arg Gly Ser Ala Ser Leu Thr Val Val Asp Gly Val Tyr Tyr His Ser
210 215 220

Pro Thr Lys Ala Leu Tyr Val Thr Gly Arg Thr Ala Thr Trp Gln Gly
225 230 235 240

Ala Gln Ile Asp Met Thr Ser Leu Leu Glu Lys Gly Lys Asp Tyr Gln
245 250 255

Phe Ser Ile Trp Val Tyr Gln Asn Ser Gly Ser Asp Gln Lys Ile Thr
260 265 270

Leu Thr Met Gln Arg Lys Asn Glu Asp Gly Thr Thr Ser Tyr Asp Ser
275 280 285

Ile Lys Tyr Gln Gln Thr Val Pro Ser Gly Thr Trp Thr Glu Val Thr
290 295 300

Gly Ser Tyr Thr Val Pro Gln Thr Ala Thr Gln Leu Ile Phe Tyr Val
305 310 315 320

Glu Ser Pro Asn Ile Asn Phe Asp Phe Tyr Leu Asp Asp Phe Thr Ala
325 330 335

Val Asp Lys Asn Pro Pro Val Val Asn Pro Gly Leu Val Lys Ser Cys
340 345 350

Thr Phe Glu Ser Gly Ser Thr Glu Gly Phe Val Gln Arg Gly Ser Ala
355 360 365

Ser Leu Thr Val Val Asp Gly Val Tyr Tyr His Ser Pro Thr Lys Ala
370 375 380

Leu Tyr Val Thr Gly Arg Thr Ala Thr Trp Gln Gly Ala Gln Ile Asp
385 390 395 400

Met Thr Ser Leu Leu Glu Lys Gly Lys Asp Tyr Gln Phe Ser Ile Trp
405 410 415

Val Tyr Gln Asn Ser Gly Ser Asp Gln Lys Ile Thr Leu Thr Met Gln
420 425 430

Arg Lys Asn Glu Asp Gly Thr Thr Ser Tyr Asp Ser Ile Lys Tyr Gln
435 440 445

Gln Thr Val Pro Ser Gly Thr Trp Thr Glu Val Thr Gly Ser Tyr Thr
450 455 460

Val Pro Gln Thr Ala Thr Gln Leu Ile Phe Tyr Val Glu Ser Pro Asn
465 470 475 480

Ile Asn Phe Asp Phe Tyr Leu Asp Asp Phe Thr Val Ile Asp Lys Asn
485 490 495

Pro Val Thr Val Pro Ile Ala Ala Lys Glu Pro Glu Trp Glu Ile Pro
500 505 510

Ser Leu Cys Gln Gln Tyr Ser Gln Tyr Phe Ser Ile Gly Val Ala Ile
515 520 525

Pro Tyr Lys Val Leu Gln Asn Pro Val Glu Arg Ala Met Val Leu Lys
530 535 540

His Phe Asn Ser Ile Thr Ala Glu Asn Glu Met Lys Pro Asp Ala Leu
545 550 555 560

Gln Arg Thr Glu Gly Asn Phe Thr Phe Asp Ile Ala Asp Gln Tyr Val
565 570 575

Asn Phe Ala Gln Gln Asn Gly Ile Gly Ile Arg Gly His Thr Leu Val
580 585 590

Trp His Ser Gln Val Pro Asn Trp Phe Phe Gln His Ser Asp Gly Thr
595 600 605

Ser Leu Asp Pro Ser Asn Pro Asp Asp Lys Gln Leu Leu Arg Asp Arg
610 615 620

Leu Lys Asn His Ile Gln Thr Val Met Ser Arg Tyr Lys Gly Lys Val
625 630 635 640

Tyr Ala Trp Asp Val Val Asn Glu Ala Ile Asp Glu Ser Gln Pro Asp
645 650 655

Gly Phe Arg Arg Ser Glu Trp Tyr Arg Ile Leu Gly Pro Thr Pro Glu
660 665 670

Thr Asn Gly Ile Pro Glu Tyr Ile Val Leu Ala Phe Arg Tyr Ala Arg
675 680 685

Glu Ala Asp Pro Asp Ala Lys Leu Phe Tyr Asn Asp Tyr Asn Thr Glu
690 695 700

Ile Ser Lys Lys Arg Gln Phe Ile Tyr Asp Met Val Lys Lys Leu His
705 710 715 720

Asp Met Gly Leu Ile Asp Gly Val Gly Leu Gln Gly His Ile Asn Val
725 730 735

Asp Ser Pro Thr Val Lys Asp Ile Glu Asp Thr Ile Asn Leu Phe Ser
740 745 750

Thr Ile Pro Gly Leu Glu Ile Gln Val Thr Glu Leu Asp Ile Ser Val
755 760 765

Tyr Thr Ser Ser Ser Gln Arg Tyr Asp Thr Leu Pro Gln Asp Ile Met
770 775 780

Ile Lys Gln Ala Met Lys Phe Lys Glu Leu Phe Glu Met Leu Lys Arg
785 790 795 800

His Ser Asp Arg Val Thr Asn Val Thr Leu Trp Gly Leu Lys Asp Asp
805 810 815

Tyr Ser Trp Leu Ser Lys Asp Arg Asn Asn Trp Pro Leu Leu Phe Asp
820 825 830

Ser Asn Tyr Gln Ala Lys Tyr Ser Tyr Trp Ala Ile Gln Lys Ala Ser
835 840 845

Arg Glu Tyr Phe
850



<210> 121 
<211> 1905 
<212> DNA
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 121 

atgaagcata tttttattgt attaattgtt tccctgctgt ttagcttcgg gggatatgct 60

caacaaacca ttagcagagc tccgcagggg tttgaccagc aacgtgccgg cattgcatcc 120

ggtaaagttg aaatcgtaac ctataaatcg aaaaccgtag gagtgaatcg ctctgcacgt 180

gtttatacac cagccggatt ctcaaaaaag aagaaatatc ctgtgcttta tttattacat 240

ggcattggag gcgacgaaga tgagtggtac aaaaacggcg ttcctcatat tattttcgac 300

aacctgattg ccgacggcaa aatggaaccg atgattgtgg tactgcccaa tggtcgcgcc 360

atgaaaaacg accgtgccga aggaaatatt ttcgacaaag agaaagttga agcctttgca 420

acattcgaaa aagacctttt aaacgattta ataccgttta tcgaaaaaaa ataccctgta 480

ttaaaaaccc gtgagtttcg cgccattgca ggattatcaa tgggcggcgg acaatcgctc 540

aattttggac tgggaaatct cgacaaattt gcatgggtag gcggcttttc atcggccccc 600

aataccaaaa tgcccgctga gttggttcca aacactcaaa aggcaacaga aatgcttaag 660

ttgctttatg tgtcttgtgg cgataaagac aatttaatgc aggttagtca gcgcacccac 720

gattatctga aagccaataa agtacctcat attttcaggg ttattcctga tggttaccac 780

gattttaatg tttggaaaga cgatttgtat cattacgtac aaatgctgtt taagcctgtg 840

gtaatgcccg tagcagcagc tactttaaaa gatgcttata aagggaaatt cttcattgga 900

actgccctta atacccctca aattttgggt accgctgttg atgaagtgaa tattgttaaa 960

acccatttca actccattgt tgccgaaaac tgtatgaaga gtggcccgat gcaaccacaa 1020

gaagggaaat ttgagtttga cctggccgat aagtttgtag agtttggagt taaaaacaat 1080

atgcagatta ttggtcatac gcttatctgg cattcgcagg caccccgctg gttttttacc 1140

gacagcgaag gcaaggacgt atcgcccgag gtgcttaccg agcgcatgaa aaaccatatc 1200

tatactgttg ttggccgtta caaaggcaag gtgcacggat gggatgtggt gaatgaagcc 1260

atagttgacg atggcagcta ccgaaacagt aaattctacc aaatactggg cgaagatttt 1320

atcaaactgg cattccagtt tgctcatgaa gccgaccccg atgcagaatt gtactacaac 1380

gattattccg aatttgttcc tgccaaaaga gaaggcattg cccgcatggt gaagaaactc 1440

aaagaccagg gcattagaat cgacggcgtt ggatttcagt gccatattgg cctcgattat 1500

ccaggcctgg atgaatacga aaaaaccatt caattaattg ccaacgaggg ggtaaaagta 1560

atgataaccg aaatggaaat atcggtatta cccatgcccg actggcgcgt tggtgctgag 1620

atttcggcca gtttcgaata tcaacagaaa ttaaatccct acaccgaagg attgcccgat 1680

tcagtgaatg ctcaattaga acagcgttat gtcgactttt tcacgctctt ccttaaatat 1740

cacgaagtga ttccaagagt tacggtttgg ggggttaacg atggcaactc atggaaaaac 1800

ggattcccgg tgcgtggaag aaccgactac ccattgttat tcgaccggaa aaatcagcct 1860

aaatcagctg ttgccaaatt aattgaactg gctaatacaa agtag 1905

<210> 122 
<211> 634 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(20)

<400> 122

Met Lys His Ile Phe Ile Val Leu Ile Val Ser Leu Leu Phe Ser Phe

1 5 10 15

Gly Gly Tyr Ala Gln Gln Thr Ile Ser Arg Ala Pro Gln Gly Phe Asp

20 25 30

Gln Gln Arg Ala Gly Ile Ala Ser Gly Lys Val Glu Ile Val Thr Tyr

35 40 45

Lys Ser Lys Thr Val Gly Val Asn Arg Ser Ala Arg Val Tyr Thr Pro

50 55 60

Ala Gly Phe Ser Lys Lys Lys Lys Tyr Pro Val Leu Tyr Leu Leu His
65 70 75 80

Gly Ile Gly Gly Asp Glu Asp Glu Trp Tyr Lys Asn Gly Val Pro His

85 90 95

Ile Ile Phe Asp Asn Leu Ile Ala Asp Gly Lys Met Glu Pro Met Ile

100 105 110

Val Val Leu Pro Asn Gly Arg Ala Met Lys Asn Asp Arg Ala Glu Gly

115 120 125

Asn Ile Phe Asp Lys Glu Lys Val Glu Ala Phe Ala Thr Phe Glu Lys

130 135 140

Asp Leu Leu Asn Asp Leu Ile Pro Phe Ile Glu Lys Lys Tyr Pro Val
145 150 155 160

Leu Lys Thr Arg Glu Phe Arg Ala Ile Ala Gly Leu Ser Met Gly Gly

165 170 175

Gly Gln Ser Leu Asn Phe Gly Leu Gly Asn Leu Asp Lys Phe Ala Trp

180 185 190

Val Gly Gly Phe Ser Ser Ala Pro Asn Thr Lys Met Pro Ala Glu Leu

195 200 205

Val Pro Asn Thr Gln Lys Ala Thr Glu Met Leu Lys Leu Leu Tyr Val

210 215 220

Ser Cys Gly Asp Lys Asp Asn Leu Met Gln Val Ser Gln Arg Thr His
225 230 235 240

Asp Tyr Leu Lys Ala Asn Lys Val Pro His Ile Phe Arg Val Ile Pro

245 250 255

Asp Gly Tyr His Asp Phe Asn Val Trp Lys Asp Asp Leu Tyr His Tyr

260 265 270

Val Gln Met Leu Phe Lys Pro Val Val Met Pro Val Ala Ala Ala Thr

275 280 285

Leu Lys Asp Ala Tyr Lys Gly Lys Phe Phe Ile Gly Thr Ala Leu Asn

290 295 300

Thr Pro Gln Ile Leu Gly Thr Ala Val Asp Glu Val Asn Ile Val Lys
305 310 315 320

Thr His Phe Asn Ser Ile Val Ala Glu Asn Cys Met Lys Ser Gly Pro

325 330 335

Met Gln Pro Gln Glu Gly Lys Phe Glu Phe Asp Leu Ala Asp Lys Phe

340 345 350

Val Glu Phe Gly Val Lys Asn Asn Met Gln Ile Ile Gly His Thr Leu

355 360 365

Ile Trp His Ser Gln Ala Pro Arg Trp Phe Phe Thr Asp Ser Glu Gly

370 375 380

Lys Asp Val Ser Pro Glu Val Leu Thr Glu Arg Met Lys Asn His Ile
385 390 395 400

Tyr Thr Val Val Gly Arg Tyr Lys Gly Lys Val His Gly Trp Asp Val
405 410 415
Val Asn Glu Ala Ile Val Asp Asp Gly Ser Tyr Arg Asn Ser Lys Phe

420 425 430

Tyr Gln Ile Leu Gly Glu Asp Phe Ile Lys Leu Ala Phe Gln Phe Ala

435 440 445

His Glu Ala Asp Pro Asp Ala Glu Leu Tyr Tyr Asn Asp Tyr Ser Glu

450 455 460

Phe Val Pro Ala Lys Arg Glu Gly Ile Ala Arg Met Val Lys Lys Leu
465 470 475 480

Lys Asp Gln Gly Ile Arg Ile Asp Gly Val Gly Phe Gln Cys His Ile

485 490 495

Gly Leu Asp Tyr Pro Gly Leu Asp Glu Tyr Glu Lys Thr Ile Gln Leu

500 505 510

Ile Ala Asn Glu Gly Val Lys Val Met Ile Thr Glu Met Glu Ile Ser

515 520 525

Val Leu Pro Met Pro Asp Trp Arg Val Gly Ala Glu Ile Ser Ala Ser

530 535 540

Phe Glu Tyr Gln Gln Lys Leu Asn Pro Tyr Thr Glu Gly Leu Pro Asp
545 550 555 560

Ser Val Asn Ala Gln Leu Glu Gln Arg Tyr Val Asp Phe Phe Thr Leu

565 570 575

Phe Leu Lys Tyr His Glu Val Ile Pro Arg Val Thr Val Trp Gly Val

580 585 590

Asn Asp Gly Asn Ser Trp Lys Asn Gly Phe Pro Val Arg Gly Arg Thr

595 600 605

Asp Tyr Pro Leu Leu Phe Asp Arg Lys Asn Gln Pro Lys Ser Ala Val

610 615 620

Ala Lys Leu Ile Glu Leu Ala Asn Thr Lys
625 630

<210> 123 
<211> 1200 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample
<400> 123

atgatcgttg gattctcgtt tatgctgctg cttcctttag ggatgacgaa tgcattggca 60 

aaaacggaac cagcgtacgc taaaaagccg cgaatcagcg cattgcacgc ccctcaattg 120

gatcagcgct acaaagattc cttcactatt ggggcggccg ttgaacctta tcagttgcaa 180 

aacgaaaaag acgtccaaat gctgaaacgc cattttaaca gcattgtcgc tgagaacgtt 240 

atgaaaccga tcaacatcca acccgaagaa ggaaagttca attttgctga ggcggatcaa 300 

atcgtccgat ttgctaaaaa acatcatatg gatattcgtt tccatacact cgtttggcac 360 

agccaagtac ctcaatggtt ctttcttgac aaggaaggca agccgatggt caatgaaacg 420 

gatccggcaa agcgcgaaca aaataaacag ctgttactga aacggctcga aatccatatt 480 

aaaacgattg tcgaacggta taaagacgac atcaaatatt gggacgtcgt gaacgaggta 540 

gtcggggatg atggaaaatt gcgcaattcc ccgtggtatc aaatcgccgg catcgattat 600 

atcaaggtag cattccaaac ggcgagaaca tatggcggca acaagattaa actgtacatc 660 

aacgattaca ataccgaagt ggaaccgaag cgaagcgctc tttataactt agtgaaacaa 720 

ttaaaagaag aaggcgttcc cattgacggg attggccacc agtcccacat ccaaattggc 780 

tggccttctg aagaagaaat cgaaaaaacg atcaacatgt ttgccgatct agggttagac 840 

aatcaaatta cggagctgga tgtgagcatg tacggctggc cgccgcgcgc ctacccgtcg 900 

tatgacgcca ttccggaaca aaagtttttg gaccaagcgg ctcgctatga ccgattgttt 960 

aagctgtacg aaaaacttgg cgataaaatc agcaacgtca ccttctgggg catcgccgac 1020 

aaccatacgt ggctcgacag ccgtgcggat gtgtactatg acgccaacgg gaatgttgtg 1080 

gttgacccga acgctccgta cgcaaaagtg gaaaaaggga aaggaaaaga tgcgccgttt 1140 

ctgttcgacc ccgaatacca cgtaaaacct gcgtattggg ccattatcga tcataagtga 1200 

<210> 124 
<211> 399 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(20) 
<400> 124

Met Ile Val Gly Phe Ser Phe Met Leu Leu Leu Pro Leu Gly Met Thr

1 5 10 15

Asn Ala Leu Ala Lys Thr Glu Pro Ala Tyr Ala Lys Lys Pro Arg Ile

20 25 30

Ser Ala Leu His Ala Pro Gln Leu Asp Gln Arg Tyr Lys Asp Ser Phe

35 40 45

Thr Ile Gly Ala Ala Val Glu Pro Tyr Gln Leu Gln Asn Glu Lys Asp

50 55 60

Val Gln Met Leu Lys Arg His Phe Asn Ser Ile Val Ala Glu Asn Val
65 70 75 80

Met Lys Pro Ile Asn Ile Gln Pro Glu Glu Gly Lys Phe Asn Phe Ala

85 90 95

Glu Ala Asp Gln Ile Val Arg Phe Ala Lys Lys His His Met Asp Ile

100 105 110

Arg Phe His Thr Leu Val Trp His Ser Gln Val Pro Gln Trp Phe Phe

115 120 125

Leu Asp Lys Glu Gly Lys Pro Met Val Asn Glu Thr Asp Pro Ala Lys

130 135 140

Arg Glu Gln Asn Lys Gln Leu Leu Leu Lys Arg Leu Glu Ile His Ile
145 150 155 160

Lys Thr Ile Val Glu Arg Tyr Lys Asp Asp Ile Lys Tyr Trp Asp Val

165 170 175

Val Asn Glu Val Val Gly Asp Asp Gly Lys Leu Arg Asn Ser Pro Trp

180 185 190

Tyr Gln Ile Ala Gly Ile Asp Tyr Ile Lys Val Ala Phe Gln Thr Ala

195 200 205

Arg Thr Tyr Gly Gly Asn Lys Ile Lys Leu Tyr Ile Asn Asp Tyr Asn

210 215 220

Thr Glu Val Glu Pro Lys Arg Ser Ala Leu Tyr Asn Leu Val Lys Gln
225 230 235 240

Leu Lys Glu Glu Gly Val Pro Ile Asp Gly Ile Gly His Gln Ser His

245 250 255

Ile Gln Ile Gly Trp Pro Ser Glu Glu Glu Ile Glu Lys Thr Ile Asn

260 265 270

Met Phe Ala Asp Leu Gly Leu Asp Asn Gln Ile Thr Glu Leu Asp Val

275 280 285

Ser Met Tyr Gly Trp Pro Pro Arg Ala Tyr Pro Ser Tyr Asp Ala Ile

290 295 300

Pro Glu Gln Lys Phe Leu Asp Gln Ala Ala Arg Tyr Asp Arg Leu Phe
305 310 315 320

Lys Leu Tyr Glu Lys Leu Gly Asp Lys Ile Ser Asn Val Thr Phe Trp
325 330 335

Gly Ile Ala Asp Asn His Thr Trp Leu Asp Ser Arg Ala Asp Val Tyr

340 345 350

Tyr Asp Ala Asn Gly Asn Val Val Val Asp Pro Asn Ala Pro Tyr Ala

355 360 365

Lys Val Glu Lys Gly Lys Gly Lys Asp Ala Pro Phe Leu Phe Asp Pro

370 375 380

Glu Tyr His Val Lys Pro Ala Tyr Trp Ala Ile Ile Asp His Lys
385 390 395

<210> 125 
<211> 1089 
<212> DNA 
<213> Unknown

<220> 

<223> Obtained from an environmental sample 
<400> 125 

atgttgacga ccccgacaac tcaagatcat gtccccgtgc tcaaggacgc tttcaaaggc 60 
aagctcctca ttggagccgt gctcggttac gatgctctcc aggggaagga cccgctgagt 120 
gagaaaattg cgaccactca cttcgatgct ctcactgctg aaaacagcat gaagccggct 180 
ctcgtgcaac ccaaagaggg cgagtttgat ttcgctgatg gagatcgtct ccttgaaatc 240 
gcgcagcaat gcggcgctac tgcaatcggc catactctgc tctggcacca acaaacgcca 300 
cgctggtttt ttgaagggcc agatggtcag cctgctgacc gtgagttggc cctggcacgc 360 
atgaggaagc acatttccac tctcgttggt cgctataaag gtcgcattaa acaatgggat 420 
gtggtgaatg aggcgattag cgatgcagag ggcgagtact taagaccaaa gagcccctgg 480

tttlaigccg ttggagagga tcacatcgcg catgctttcc aggcagcaca tgaagctgat 540

cccgatgcca tccttatcta taacgactac aacatcgagc aggagtacaa gcgcccgaag 600

gcgatacgcc tactgaggtc attacttgag caggacgttc ccattcatgc cgtgggcatt 660

caaggccatt ggcgtatgga cactctgaat gttgccgaaa tcgaagaagc tatcgaagaa 720

tttgctgcgc tgggtctcaa ggtcatgatc accgagcttg atatcagcgt gctaccgaca 780

aagtatcagg glgccgatct cgctactcgg gaagaattga cgcctgaaat caatccctat 840

acggaggaac tacctgagga cgttgcccgg caacatgccg agtgttatcg gcaggtcttc 900

gaaatgttcc tgcgccacaa ggatgccatt agccgtgtca cgctctgggg cattcacgat 960

aacagatcat ggttcaacaa ctttccggtc agggggcgca cagactatcc tctgctattc 1020

glSgggaat gtaaccccaa gccagcgttt ttcgccgtct tgaaagctgc gcaagaccag 1080
ccacaatga 1089

<210> 126 
<211> 362 
<212> PRT 
<213> Unknown 

<220>

<223> Obtained from an environmental sample
<400> 126

Met Leu Thr Thr Pro Thr Thr Gln Asp His Val Pro Val Leu Lys Asp

1 5 10 15

Ala Phe Lys Gly Lys Leu Leu Ile Gly Ala Val Leu Gly Tyr Asp Ala

20 25 30

Leu Gln Gly Lys Asp Pro Leu Ser Glu Lys Ile Ala Thr Thr His Phe

35 40 45

Asp Ala Leu Thr Ala Glu Asn Ser Met Lys Pro Ala Leu Val Gln Pro

50 55 60

Lys Glu Gly Glu Phe Asp Phe Ala Asp Gly Asp Arg Leu Leu Glu Ile
65 70 75 80

Ala Gln Gln Cys Gly Ala Thr Ala Ile Gly His Thr Leu Leu Trp His

85 90 95

Gln Gln Thr Pro Arg Trp Phe Phe Glu Gly Pro Asp Gly Gln Pro Ala

100 105 110

Asp Arg Glu Leu Ala Leu Ala Arg Met Arg Lys His Ile Ser Thr Leu

115 120 125

Val Gly Arg Tyr Lys Gly Arg Ile Lys Gln Trp Asp Val Val Asn Glu

130 135 140

Ala Ile Ser Asp Ala Glu Gly Glu Tyr Leu Arg Pro Lys Ser Pro Trp
145 150 155 160

Phe Lys Ala Val Gly Glu Asp His Ile Ala His Ala Phe Gln Ala Ala

165 170 175

His Glu Ala Asp Pro Asp Ala Ile Leu Ile Tyr Asn Asp Tyr Asn Ile

180 185 190

Glu Gln Glu Tyr Lys Arg Pro Lys Ala Ile Arg Leu Leu Arg Ser Leu

195 200 205

Leu Glu Gln Asp Val Pro Ile His Ala Val Gly Ile Gln Gly His Trp

210 215 220

Arg Met Asp Thr Leu Asn Val Ala Glu Ile Glu Glu Ala Ile Glu Glu
225 230 235 240

Phe Ala Ala Leu Gly Leu Lys Val Met Ile Thr Glu Leu Asp Ile Ser

245 250 255

Val Leu Pro Thr Lys Tyr Gln Gly Ala Asp Leu Ala Thr Arg Glu Glu

260 265 270

Leu Thr Pro Glu Ile Asn Pro Tyr Thr Glu Glu Leu Pro Glu Asp Val

275 280 285

Ala Arg Gln His Ala Glu Cys Tyr Arg Gln Val Phe Glu Met Phe Leu

290 295 300

Arg His Lys Asp Ala Ile Ser Arg Val Thr Leu Trp Gly Ile His Asp
305 310 315 320

Gly Arg Ser Trp Phe Asn Asn Phe Pro Val Arg Gly Arg Thr Asp Tyr

325 330 335

Pro Leu Leu Phe Asp Arg Glu Cys Asn Pro Lys Pro Ala Phe Phe Ala

340 345 350

Val Leu Lys Ala Ala Gln Asp Gln Pro Gln
355 360
<210> 127
<211> 960
<212> DNA
<213> Unknown

<220>

<223> Obtained from an environmental sample
<400> 127

gtggatctcg ctgagaaatg cggcatatat attggtgcag cggttgaacc cggatattta 60

attatcaggg aatacgctga gattttatcc cgcgaattta acgtggtaac cgcggaaaat 120

gcattaaiit ttgaagctat tcatccgcag cgtggagtat attcatttga aggtgcagat 180 

gcaatagttc gatttgcaga aactcatgga atgaaggttc gtggacatac acttgtttgg 240 

caccagcagc ttcctgcatg gataacttct ggaagttacg cttgggagga gtggaagaat 300 

attctccgtg agcatgtaat gagcgttgtt ggacgatata agggccaaat atatgcatgg 360 

gatgtggtta acgaagcaat attagataac ggttcattaa gagataatgt ttggtttaga 420 

aatgtaggtc cagaatatat tgagtcagcc tttagatggg ctcatgaagc tgacccaaac 480 

gctcttctct tctataatga ttatgaagct gaggacttga atgataagtc gcatgctgtt 540 

tataacctgg ttaagagttt acttgagaaa ggtgtaccga tacatggcgt aggattacag 600 

atgcatatta acgtagaaaa tccgccgaaa ccggaagatg ttgcagcaaa cattaaacgt 660 

ctaaatgatc tgggcttgat tgtccacata acggaaatgg atgtgcgcat tagaacccca 720 

ccatcaaatg aagatctcat taaacaagca gaaatttacc gtgatatatt aagagtttgt 780 

ctttcatcag aaaaatgcac agcattcatt atgtggggat ttactgaccg ctattcatgg 840 

ataccaaatt acttcagcgg ctacggttca gctttaatat tcgatgagca atataagccc 900 

aaactagcat attactatat acttcggaca ttcatcgaaa aactaggcat taaaggttaa 960 

<210> 128 
<211> 319 
<212> PRT 
<213> Unknown 

<220>

<223> Obtained from an environmental sample
<400> 128

Val Asp Leu Ala Glu Lys Cys Gly Ile Tyr Ile Gly Ala Ala Val Glu
1 5 10 15

Pro Gly Tyr Leu Ile Ile Arg Glu Tyr Ala Glu Ile Leu Ser Arg Glu

20 25 30

Phe Asn Val Val Thr Ala Glu Asn Ala Leu Lys Phe Glu Ala Ile His

35 40 45

Pro Gln Arg Gly Val Tyr Ser Phe Glu Gly Ala Asp Ala Ile Val Arg

50 55 60

Phe Ala Glu Thr His Gly Met Lys Val Arg Gly His Thr Leu Val Trp
65 70 75 80

His Gln Gln Leu Pro Ala Trp Ile Thr Ser Gly Ser Tyr Ala Trp Glu

85 90 95

Glu Trp Lys Asn Ile Leu Arg Glu His Val Met Ser Val Val Gly Arg

100 105 110

Tyr Lys Gly Gln Ile Tyr Ala Trp Asp Val Val Asn Glu Ala Ile Leu

115 120 125

Asp Asn Gly Ser Leu Arg Asp Asn Val Trp Phe Arg Asn Val Gly Pro

130 135 140

Glu Tyr Ile Glu Ser Ala Phe Arg Trp Ala His Glu Ala Asp Pro Asn
145 150 155 160

Ala Leu Leu Phe Tyr Asn Asp Tyr Glu Ala Glu Asp Leu Asn Asp Lys

165 170 175

Ser His Ala Val Tyr Asn Leu Val Lys Ser Leu Leu Glu Lys Gly Val

180 185 190

Pro Ile His Gly Val Gly Leu Gln Met His Ile Asn Val Glu Asn Pro

195 200 205

Pro Lys Pro Glu Asp Val Ala Ala Asn Ile Lys Arg Leu Asn Asp Leu

210 215 220

Gly Leu Ile Val His Ile Thr Glu Met Asp Val Arg Ile Arg Thr Pro
225 230 235 240

Pro Ser Asn Glu Asp Leu Ile Lys Gln Ala Glu Ile Tyr Arg Asp Ile

245 250 255

Leu Arg Val Cys Leu Ser Ser Glu Lys Cys Thr Ala Phe Ile Met Trp
260 265 270
Gly Phe Thr Asp Arg Tyr Ser Trp Ile Pro Asn Tyr Phe Ser Gly Tyr

275 280 285

Gly Ser Ala Leu Ile Phe Asp Glu Gln Tyr Lys Pro Lys Leu Ala Tyr

290 295 300

Tyr Tyr Ile Leu Arg Thr Phe Ile Glu Lys Leu Gly Ile Lys Gly
305 310 315

<210> 129 
<211> 3021 
<212> DNA 
<213> Bacteria 



<400> 129 

atggtaataa atcgctccag tgcgagtgac ggtgcgtatt cggaaaaagg tttctatctc 60

gacggtggtg tagaatacaa gtacagtgtt tttgtaaaac acaacgggac cggcaccgaa 120

actttcaaac tttctgtgtc ctatttggat tcggaaacag aagaagaaaa taaggaagta 180

attgcaacaa aggatgttgt ggccggagaa tggactgaga tttcggcaaa atacaaagca 240

cccaaaactg cagtgaatat tactttgtca attacaaccg acagcactgt agatttcatt 300

tttgacgatg taaccataac ccgtaaagga atggctgagg caaacacagt atatgcagca 360

aacgctgtgc tgaaagatat gtatgcaaac tatttcagag ttggttcggt acttaactcc 420

ggaacggtaa acaattcatc aataaaggcc ttgattttaa gagagtttaa cagtattacc 480

tgtgaaaatg aaatgaagcc tgatgccaca ctggttcaat caggatcaac caatacaaat 540

atcagggttt ctcttaatcg tgcagcaagt attttaaact tctgtgcaca aaataatata 600

gccgtcagag gtcatacact ggtttggcac agccagacac ctcaatggtt tttcaaagac 660

aatttccagg acaacggaaa ctgggtttcc caatcagtta tggaccagcg tttggaaagc 720

tacataaaaa atatgtttgc tgaaatccaa agacagtatc cgtctttgaa tctttatgcc 780

tatgacgttg taaatgaggc agtaagtgat gatgcaaaca ggaccagata ttatggcggg 840

gcgagggaac ctggatacgg aaatggtaga tctccatggg ttcagatcta cggagacaac 900

aaatttattg agaaagcatt tacatatgca agaaaatatg ctccggcaaa ttgtaagctt 960

tactacaacg attacaacga atattgggat cataagagag actgtattgc ctcaatttgt 1020

gcaaacttgt acaacaaggg cttgcttgac ggtgtgggaa tgcagtccca tattaatgcg 1080

gatatgaatg gattctcagg tatacaaaat tataaagcag ctttgcagaa atatataaat 1140

atcggttgtg atgtccaaat taccgagctt gatattagta cagaaaacgg caaatttagc 1200

ttacagcagc aggctgataa atataaagct gttttccagg cagctgttga tataaacaga 1260

acctccagca aaggaaaggt tacggctgtc tgtgtatggg gacctaatga cgccaatact 1320

tggctcggtt cacaaaatgc acctcttttg tttaacgcaa acaatcaacc gaaaccggca 1380

tacaatgcgg ttgcatccat tattcctcag tccgaatggg gcgacggtaa caatccggcc 1440

ggcggcggag gaggaggcaa accggaagag ccggatgcaa acggatatta ttatcatgac 1500

acttttgaag gaagcgtagg acagtggaca gccagaggac ctgcggaagt tctgcttagc 1560

ggaagaacgg cttacaaagg ttcagaatca ctcttggtaa ggaaccgtac ggcagcatgg 1620

aacggagcac aacgggcgct gaatcccaga acgtttgttc ccggaaacac atattgtttc 1680

agcgtagtgg catcgtttat tgaaggtgcg tcttccacaa cattctgcat gaagctgcaa 1740

tacgtagacg gaagcggcac tcaacggtat gataccatag atatgaaaac tgtgggtcca 1800

aatcagtggg ttcacctgta caatccgcaa tacagaattc cttccgatgc aacagatatg 1860

tatgtttatg tggaaacagc ggatgacacc attaacttct acatagatga ggcaatcgga 1920

gcggttgccg gaactgtaat cgaaggacct gctccacagc ctacacagcc tccggtactg 1980

cttggcgatg taaacggtga tggaaccatt aactcaactg acttgacaat gttaaagaga 2040

agcgtgttga gggcaatcac ccttaccgac gatgcaaagg ctagagcaga cgttgacaag 2100

aatggatcga taaacagcac tgatgtttta cttctttcac gctacctttt aagagtaatc 2160

gacaaatttc ctgtagcaga aaatccttct tcttctttta aatatgagtc ggccgtgcaa 2220

tatcggccgg ctcctgattc ttatttaaac ccttgtccgc aggcgggaag aattgtcaag 2280

gaaacatata caggaataaa cggaactaag agtcttaatg tatatcttcc atacggttat 2340

gatccgaaca aaaaatataa cattttctac cttatgcatg gcggcggtga aaatgagaat 2400

acgattttca gcaacgatgt taaattgcaa aatatccttg accacgcgat tatgaacggt 2460

gaacttgagc ctttgattgt agtaacaccc actttcaacg gcggaaactg cacggcccaa 2520

aacttttatc aggaattcag gcaaaatgtc attccttttg tggaaagcaa gtactctact 2580

tatgcagaat caacaacccc acagggaata gccgcttcaa gaatgcacag aggtttcggc 2640

ggattctcaa tgggaggatt gacaacatgg tatgtaatgg ttaactgcct tgattacgtt 2700

gcatatttta tgcctttaag cggtgactac tggtatggaa acagtccgca ggataaggct 2760

aattcaattg ctgaagcaat taacagatcc ggactttcaa agagggagta tttcgtattt 2820

gcggccaccg gttccgagga tattgcatat gctaatatga atcctcaaat tgaagctatg 2880

aaggctttgc cgcattttga ttatacttcg gatttttcca aaggtaattt ttactttctt 2940

gtagctccgg gcgccactca ctggtgggga tacgtaagac attatattta tgatgcactt 3000

ccatatttct tccatgaatg 3021



<210> 130 
<211> 1006 
<212> PRT 
<213> Bacteria 
<400> 130

Met Val Ile Asn Arg Ser Ser Ala Ser Asp Gly Ala Tyr Ser Glu Lys

1 5 10 15

Gly Phe Tyr Leu Asp Gly Gly Val Glu Tyr Lys Tyr Ser Val Phe Val

20 25 30

Lys His Asn Gly Thr Gly Thr Glu Thr Phe Lys Leu Ser Val Ser Tyr

35 40 45

Leu Asp Ser Glu Thr Glu Glu Glu Asn Lys Glu Val Ile Ala Thr Lys

50 55 60

Asp Val Val Ala Gly Glu Trp Thr Glu Ile Ser Ala Lys Tyr Lys Ala
65 70 75 80

Pro Lys Thr Ala Val Asn Ile Thr Leu Ser Ile Thr Thr Asp Ser Thr

85 90 95

Val Asp Phe Ile Phe Asp Asp Val Thr Ile Thr Arg Lys Gly Met Ala

100 105 110

Glu Ala Asn Thr Val Tyr Ala Ala Asn Ala Val Leu Lys Asp Met Tyr

115 120 125

Ala Asn Tyr Phe Arg Val Gly Ser Val Leu Asn Ser Gly Thr Val Asn

130 135 140

Asn Ser Ser Ile Lys Ala Leu Ile Leu Arg Glu Phe Asn Ser Ile Thr
145 150 155 160

Cys Glu Asn Glu Met Lys Pro Asp Ala Thr Leu Val Gln Ser Gly Ser

165 170 175

Thr Asn Thr Asn Ile Arg Val Ser Leu Asn Arg Ala Ala Ser Ile Leu

180 185 190

Asn Phe Cys Ala Gln Asn Asn Ile Ala Val Arg Gly His Thr Leu Val

195 200 205

Trp His Ser Gln Thr Pro Gln Trp Phe Phe Lys Asp Asn Phe Gln Asp

210 215 220

Asn Gly Asn Trp Val Ser Gln Ser Val Met Asp Gln Arg Leu Glu Ser
225 230 235 240

Tyr Ile Lys Asn Met Phe Ala Glu Ile Gln Arg Gln Tyr Pro Ser Leu

245 250 255

Asn Leu Tyr Ala Tyr Asp Val Val Asn Glu Ala Val Ser Asp Asp Ala

260 265 270

Asn Arg Thr Arg Tyr Tyr Gly Gly Ala Arg Glu Pro Gly Tyr Gly Asn

275 280 285

Gly Arg Ser Pro Trp Val Gln Ile Tyr Gly Asp Asn Lys Phe Ile Glu

290 295 300

Lys Ala Phe Thr Tyr Ala Arg Lys Tyr Ala Pro Ala Asn Cys Lys Leu
305 310 315 320

Tyr Tyr Asn Asp Tyr Asn Glu Tyr Trp Asp His Lys Arg Asp Cys Ile

325 330 335

Ala Ser Ile Cys Ala Asn Leu Tyr Asn Lys Gly Leu Leu Asp Gly Val

340 345 350

Gly Met Gln Ser His Ile Asn Ala Asp Met Asn Gly Phe Ser Gly Ile

355 360 365

Gln Asn Tyr Lys Ala Ala Leu Gln Lys Tyr Ile Asn Ile Gly Cys Asp

370 375 380

Val Gln Ile Thr Glu Leu Asp Ile Ser Thr Glu Asn Gly Lys Phe Ser
385 390 395 400

Leu Gln Gln Gln Ala Asp Lys Tyr Lys Ala Val Phe Gln Ala Ala Val

405 410 415

Asp Ile Asn Arg Thr Ser Ser Lys Gly Lys Val Thr Ala Val Cys Val

420 425 430

Trp Gly Pro Asn Asp Ala Asn Thr Trp Leu Gly Ser Gln Asn Ala Pro

435 440 445

Leu Leu Phe Asn Ala Asn Asn Gln Pro Lys Pro Ala Tyr Asn Ala Val

450 455 460

Ala Ser Ile Ile Pro Gln Ser Glu Trp Gly Asp Gly Asn Asn Pro Ala
465 470 475 480

Gly Gly Gly Gly Gly Gly Lys Pro Glu Glu Pro Asp Ala Asn Gly Tyr

485 490 495

Tyr Tyr His Asp Thr Phe Glu Gly Ser Val Gly Gln Trp Thr Ala Arg

500 505 510

Gly Pro Ala Glu Val Leu Leu Ser Gly Arg Thr Ala Tyr Lys Gly Ser

515 520 525

Glu Ser Leu Leu Val Arg Asn Arg Thr Ala Ala Trp Asn Gly Ala Gln
530 535 540

Arg Ala Leu Asn Pro Arg Thr Phe Val Pro Gly Asn Thr Tyr Cys Phe
545 550 555 560

Ser Val Val Ala Ser Phe Ile Glu Gly Ala Ser Ser Thr Thr Phe Cys

565 570 575

Met Lys Leu Gln Tyr Val Asp Gly Ser Gly Thr Gln Arg Tyr Asp Thr

580 585 590

Ile Asp Met Lys Thr Val Gly Pro Asn Gln Trp Val His Leu Tyr Asn

595 600 605

Pro Gln Tyr Arg Ile Pro Ser Asp Ala Thr Asp Met Tyr Val Tyr Val

610 615 620

Glu Thr Ala Asp Asp Thr Ile Asn Phe Tyr Ile Asp Glu Ala Ile Gly
625 630 635 640

Ala Val Ala Gly Thr Val Ile Glu Gly Pro Ala Pro Gln Pro Thr Gln

645 650 655

Pro Pro Val Leu Leu Gly Asp Val Asn Gly Asp Gly Thr Ile Asn Ser

660 665 670

Thr Asp Leu Thr Met Leu Lys Arg Ser Val Leu Arg Ala Ile Thr Leu

675 680 685

Thr Asp Asp Ala Lys Ala Arg Ala Asp Val Asp Lys Asn Gly Ser Ile

690 695 700

Asn Ser Thr Asp Val Leu Leu Leu Ser Arg Tyr Leu Leu Arg Val Ile
705 710 715 720

Asp Lys Phe Pro Val Ala Glu Asn Pro Ser Ser Ser Phe Lys Tyr Glu

725 730 735

Ser Ala Val Gln Tyr Arg Pro Ala Pro Asp Ser Tyr Leu Asn Pro Cys

740 745 750

Pro Gln Ala Gly Arg Ile Val Lys Glu Thr Tyr Thr Gly Ile Asn Gly

755 760 765

Thr Lys Ser Leu Asn Val Tyr Leu Pro Tyr Gly Tyr Asp Pro Asn Lys

770 775 780

Lys Tyr Asn Ile Phe Tyr Leu Met His Gly Gly Gly Glu Asn Glu Asn
785 790 795 800

Thr Ile Phe Ser Asn Asp Val Lys Leu Gln Asn Ile Leu Asp His Ala

805 810 815

Ile Met Asn Gly Glu Leu Glu Pro Leu Ile Val Val Thr Pro Thr Phe

820 825 830

Asn Gly Gly Asn Cys Thr Ala Gln Asn Phe Tyr Gln Glu Phe Arg Gln

835 840 845

Asn Val Ile Pro Phe Val Glu Ser Lys Tyr Ser Thr Tyr Ala Glu Ser

850 855 860

Thr Thr Pro Gln Gly Ile Ala Ala Ser Arg Met His Arg Gly Phe Gly
865 870 875 880

Gly Phe Ser Met Gly Gly Leu Thr Thr Trp Tyr Val Met Val Asn Cys

885 890 895

Leu Asp Tyr Val Ala Tyr Phe Met Pro Leu Ser Gly Asp Tyr Trp Tyr

900 905 910

Gly Asn Ser Pro Gln Asp Lys Ala Asn Ser Ile Ala Glu Ala Ile Asn

915 920 925

Arg Ser Gly Leu Ser Lys Arg Glu Tyr Phe Val Phe Ala Ala Thr Gly

930 935 940

Ser Glu Asp Ile Ala Tyr Ala Asn Met Asn Pro Gln Ile Glu Ala Met
945 950 955 960

Lys Ala Leu Pro His Phe Asp Tyr Thr Ser Asp Phe Ser Lys Gly Asn

965 970 975

Phe Tyr Phe Leu Val Ala Pro Gly Ala Thr His Trp Trp Gly Tyr Val

980 985 990

Arg His Tyr Ile Tyr Asp Ala Leu Pro Tyr Phe Phe His Glu
995 1000 1005

<210> 131 
<211> 1218 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 131 
atgccgatca tccgaaccct atcgagttac atgcgaaatc atcaagcgat ctaccgtcag 60

ctcctcacgc tggccgccgc cgtcacgctg gcgggcgcgg ccaccgcgga ggaagaagcc 120 

accctgcgcg gggtttacga aaaggacttc accatcggcg tggccatgaa cgggggccag 180 

gcctccggcc gcaatgccgc cgccggcgag atcatcggca agcagttctc ctcgctcacc 240 

gcggagaacg acatgaagtg gcagatgatc cacccccagg agggtcaata ccgcttcgaa 300 

acgtccgacg cctacgtcgc gttcgcggaa aagcacaaga tggaagtcat cggccacacc 360 

ctcgtgtggc acagccagac cccgcagtgg gtcttccagg gtgaaaacgg ccagcccgcc 420 

accaaggaag agctgctcaa gcgcatgcgc gaccacatcc acgccgtggc cggccgttac 480 

aagggcaaga tcaagggctg ggacgtcgtc aacgaagcgc tctccgacgg cggggacgac 540 

attctccgcc agtccccctg gcgccgcatc atcggcgacg acttcatcga ctacgccttc 600 

cgctacgcca aggaagccgc cccggatgcc gagctctact acaacgacta caacctcgag 660 

atcccccgca agcgcgccaa ttgcatcacg ctggtcaagg gcatgctcga gcgcggcgtg 720 

ccgatcgacg gcatcggcac ccagtcgcac ttccagctcg gctttccctc cttggacgac 780 

gtggaagcca ccatcaagga attcgccgcc ctgggcatga aggtgatgat caccgagctc 840 

gacgtggatg tcctgccccg caacaacccc ggggtcgccg acatcgccaa ccgcgaacag 900 

ggagccaacc cctacaccga aggccttccg gacgacgtgc aggaaaagct cgcgaagcgc 960 

tacgaggaca tcttccgcat ctacctgaag taccgcgacc acgtcacccg cgtcaccttc 1020 

tggggcctgg atgacggcat gacctggctg aacggcttcc cggtccgcgg ccgcaccaac 1080 

caccccctgc tctacgaccg gcagctcaat gccaagcccg ccttccacgc cctcgtcaag 1140 

ctgggtcagg aagagcgtcc ggaagccgcc aaggtcgagg tccagaagat cgaagcgaag 1200 

aaagaagagg cgaagtaa 1218 

<210> 132 
<211> 405 
<212> PRT 
<213> Unknown

<220>

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(26)

<400> 132

Met Pro Ile Ile Arg Thr Leu Ser Ser Tyr Met Arg Asn His Gln Ala

1 5 10 15

Ile Tyr Arg Gln Leu Leu Thr Leu Ala Ala Ala Val Thr Leu Ala Gly

20 25 30

Ala Ala Thr Ala Glu Glu Glu Ala Thr Leu Arg Gly Val Tyr Glu Lys

35 40 45

Asp Phe Thr Ile Gly Val Ala Met Asn Gly Gly Gln Ala Ser Gly Arg

50 55 60

Asn Ala Ala Ala Gly Glu Ile Ile Gly Lys Gln Phe Ser Ser Leu Thr
65 70 75 80

Ala Glu Asn Asp Met Lys Trp Gln Met Ile His Pro Gln Glu Gly Gln

85 90 95

Tyr Arg Phe Glu Thr Ser Asp Ala Tyr Val Ala Phe Ala Glu Lys His

100 105 110

Lys Met Glu Val Ile Gly His Thr Leu Val Trp His Ser Gln Thr Pro

115 120 125

Gln Trp Val Phe Gln Gly Glu Asn Gly Gln Pro Ala Thr Lys Glu Glu

130 135 140

Leu Leu Lys Arg Met Arg Asp His Ile His Ala Val Ala Gly Arg Tyr
145 150 155 160

Lys Gly Lys Ile Lys Gly Trp Asp Val Val Asn Glu Ala Leu Ser Asp

165 170 175

Gly Gly Asp Asp Ile Leu Arg Gln Ser Pro Trp Arg Arg Ile Ile Gly

180 185 190

Asp Asp Phe Ile Asp Tyr Ala Phe Arg Tyr Ala Lys Glu Ala Ala Pro

195 200 205

Asp Ala Glu Leu Tyr Tyr Asn Asp Tyr Asn Leu Glu Ile Pro Arg Lys

210 215 220

Arg Ala Asn Cys Ile Thr Leu Val Lys Gly Met Leu Glu Arg Gly Val
225 230 235 240

Pro Ile Asp Gly Ile Gly Thr Gln Ser His Phe Gln Leu Gly Phe Pro

245 250 255

Ser Leu Asp Asp Val Glu Ala Thr Ile Lys Glu Phe Ala Ala Leu Gly

260 265 270

Met Lys Val Met Ile Thr Glu Leu Asp Val Asp Val Leu Pro Arg Asn
275 280 285

Asn Pro Gly Val Ala Asp Ile Ala Asn Arg Glu Gln Gly Ala Asn Pro

290 295 300

Tyr Thr Glu Gly Leu Pro Asp Asp Val Gln Glu Lys Leu Ala Lys Arg
305 310 315 320

Tyr Glu Asp Ile Phe Arg Ile Tyr Leu Lys Tyr Arg Asp His Val Thr

325 330 335

Arg Val Thr Phe Trp Gly Leu Asp Asp Gly Met Thr Trp Leu Asn Gly

340 345 350

Phe Pro Val Arg Gly Arg Thr Asn His Pro Leu Leu Tyr Asp Arg Gln

355 360 365

Leu Asn Ala Lys Pro Ala Phe His Ala Leu Val Lys Leu Gly Gln Glu

370 375 380

Glu Arg Pro Glu Ala Ala Lys Val Glu Val Gln Lys Ile Glu Ala Lys
385 390 395 400

Lys Glu Glu Ala Lys
405

<210> 133 
<211> 1011 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample
<400> 133 

atgaaaaata atcaatttag gaaaatccct tccctacata aggtatataa gagtcatttt 60 

ttaattgggg cagctgtaaa tccacttaca cttcaaacac aacaggaact aatcaaaaag 120 

cactttaata gtattacggc agaaaatgaa atgaaatttg aagagttgca acctgagcct 180 

ggacatttta catttgatgt aggagataaa atggtcgctt tcgcaaaaga aaatggtatg 240 

aaagttagag gtcatacatt aatctggcac aatcaaacac ctgattggat gtttaagaat 300 

gaagatggtt ctgtcacaga tcgagataca cttcttgaaa gaatgaaatt acatattaca 360 

actgttatgg agcattataa ggggcaaatt tattgttggg atgttgtcaa tgaagcgatt 420 

gctgatgaag gatcagagtt attacgtcac tctaaatgga ctgaaattat tggcgacgat 480 

tttattgaaa aggcatttga gtatgcacat gaagcagacc cagaagcttt actattctat 540 

aatgactata atgagtccca ccctcataag cgagataaaa tttacacact aataaaaaga 600 

ttggtagaca aaggcatacc tattcacggg gttggcttgc aagcacattg gaatttaaca 660 

gacccttctt atgaggagat tagggctgca attgaaaaat atgcctcatt aggcttggaa 720 

atacatctta cagaaatgga tgtttcagtg ttcaattttg aagatcgaag aacagactta 780 

acagagccga ctaatgaaat gaagactctt caagtagaac gttatacgga atttttcaaa 840 

atacttagag aatatagcca tgtgattagc tctgtcactt tttggggagc tgcagatgat 900 

tatacttggt tggatgggtt tccagttaga ggaaggaaaa actggccatt tgtttttgac 960 

gaaaaccacc aaccgaaaga atctttctgg ggaattgtcg attttgaata a 1011 

<210> 134 
<211> 336 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 134 

Met Lys Asn Asn Gln Phe Arg Lys Ile Pro Ser Leu His Lys Val Tyr

1 5 10 15

Lys Ser His Phe Leu Ile Gly Ala Ala Val Asn Pro Leu Thr Leu Gln

20 25 30

Thr Gln Gln Glu Leu Ile Lys Lys His Phe Asn Ser Ile Thr Ala Glu

35 40 45

Asn Glu Met Lys Phe Glu Glu Leu Gln Pro Glu Pro Gly His Phe Thr

50 55 60

Phe Asp Val Gly Asp Lys Met Val Ala Phe Ala Lys Glu Asn Gly Met

65 70 75 80

Lys Val Arg Gly His Thr Leu Ile Trp His Asn Gln Thr Pro Asp Trp

85 90 95

Met Phe Lys Asn Glu Asp Gly Ser Val Thr Asp Arg Asp Thr Leu Leu

100 105 110

Glu Arg Met Lys Leu His Ile Thr Thr Val Met Glu His Tyr Lys Gly




115 120 125

Gln Ile Tyr Cys Trp Asp Val Val Asn Glu Ala Ile Ala Asp Glu Gly

130 135 140

Ser Glu Leu Leu Arg His Ser Lys Trp Thr Glu Ile Ile Gly Asp Asp
145 150 155 160

Phe Ile Glu Lys Ala Phe Glu Tyr Ala His Glu Ala Asp Pro Glu Ala

165 170 175

Leu Leu Phe Tyr Asn Asp Tyr Asn Glu Ser His Pro His Lys Arg Asp

180 185 190

Lys Ile Tyr Thr Leu Ile Lys Arg Leu Val Asp Lys Gly Ile Pro Ile

195 200 205

His Gly Val Gly Leu Gln Ala His Trp Asn Leu Thr Asp Pro Ser Tyr

210 215 220

Glu Glu Ile Arg Ala Ala Ile Glu Lys Tyr Ala Ser Leu Gly Leu Glu
225 230 235 240

Ile His Leu Thr Glu Met Asp Val Ser Val Phe Asn Phe Glu Asp Arg

245 250 255

Arg Thr Asp Leu Thr Glu Pro Thr Asn Glu Met Lys Thr Leu Gln Val

260 265 270

Glu Arg Tyr Thr Glu Phe Phe Lys Ile Leu Arg Glu Tyr Ser His Val

275 280 285

Ile Ser Ser Val Thr Phe Trp Gly Ala Ala Asp Asp Tyr Thr Trp Leu

290 295 300

Asp Gly Phe Pro Val Arg Gly Arg Lys Asn Trp Pro Phe Val Phe Asp
305 310 315 320

Glu Asn His Gln Pro Lys Glu Ser Phe Trp Gly Ile Val Asp Phe Glu
325 330 335

<210> 135 
<211> 1170 
<212> DNA 
<213> Unknown 



<220> 

<223> Obtained from an environmental sample



<400> 135 

atgcgacgcc tcatcgccct tgtcctatat ataggaaccg ccgcgagcgg gacctccgtg 60

gagaccgttg cggccgaatc gaaacagccg aaagctagcc taaagaatgc gttcgcagac 120

gattttcgtg tcggcgctgc aattggcacc aatcaggtca tgggcgagga gccaaaatcg 180

ctcgaggttg tcgcccagca gttcaacaca atcacgcctg agaatctcct caaatgggct 240

gaggtccacc cagaagcaga ccgctacaac ttcgaaccgt ccgatcgctt cgtcgaattt 300

ggcgaaaaga acaacatgtt catcgtcggc cacacgctcg tgtggcataa ccaaacgccg 360

gactgggcct ttgagggcaa ggacggcaag ccgctcgatc gcgaaacagc gctcgcccga 420

atcaaggaac acattgaaac cgtggtcggc cgatatcgcg gccgcatcca tgcttgggac 480

gtcgtgaacg aggcaatcga cgacaacggc aaacttcgta Qtgggccggt cggagtgccc 540

ggtcagcgcg gcgaaccgtg gcacgccgcc atcggagacg actacatcca gaaggcgttc 600

gaattcgcgc acaccgccga ccccgacgct gaactctatt acaacgacta caacgaatgg 660

cacccgaaaa agatcgaagc catctcgcag ctggtgcggt cgctcaaaga gaagggcgtt 720

cgtatcgatg gcctcggtct ccagggccat tgggggatgg attacccgaa agtcgaagag 780

atcgatcaca tgctaaccga gtatggcaag ctcggcgtga agctcatgat taccgaactc 840

gacatcaaca tgcttccgca gcccgacccg agtcaacgcg gcgccgatat cactcgcaac 900

tacgagctca gaaaggagct cgatccgtat tccgacggac tcccgcccga tatgcaaaag 960

gcactcgcgg cgcgttatgc tgaaatcttc gaagtcttcg ctaagcatcg cgataagctc 1020

gaccgcgtca cattttgggg cgttcacgac ggccattcat ggctcaacaa ctggcctgtt 1080

cccggtcgca ctgcctaccc gcttctcttc gacacgaagc ttcagcccaa gccggcattt 1140

gatgccgtca tcggagtcgc agagcaatga 1170



<210> 136 
<211> 389 
<212> PRT 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(25) 



<400> 136 

Met Arg Arg Leu Ile Ala Leu Val Leu Tyr Ile Gly Thr Ala Ala Ser

1 5 10 15

Gly Thr Ser Val Glu Thr Val Ala Ala Glu Ser Lys Gln Pro Lys Ala

20 25 30

Ser Leu Lys Asn Ala Phe Ala Asp Asp Phe Arg Val Gly Ala Ala Ile

35 40 45

Gly Thr Asn Gln Val Met Gly Glu Glu Pro Lys Ser Leu Glu Val Val

50 55 60

Ala Gln Gln Phe Asn Thr Ile Thr Pro Glu Asn Leu Leu Lys Trp Ala
65 70 75 80

Glu Val His Pro Glu Ala Asp Arg Tyr Asn Phe Glu Pro Ser Asp Arg

85 90 95

Phe Val Glu Phe Gly Glu Lys Asn Asn Met Phe Ile Val Gly His Thr

100 105 110

Leu Val Trp His Asn Gln Thr Pro Asp Trp Ala Phe Glu Gly Lys Asp

115 120 125

Gly Lys Pro Leu Asp Arg Glu Thr Ala Leu Ala Arg Ile Lys Glu His

130 135 140

Ile Glu Thr Val Val Gly Arg Tyr Arg Gly Arg Ile His Ala Trp Asp
145 150 155 160

Val Val Asn Glu Ala Ile Asp Asp Asn Gly Lys Leu Arg Ser Gly Pro

165 170 175

Val Gly Val Pro Gly Gln Arg Gly Glu Pro Trp His Ala Ala Ile Gly

180 185 190

Asp Asp Tyr Ile Gln Lys Ala Phe Glu Phe Ala His Thr Ala Asp Pro

195 200 205

Asp Ala Glu Leu Tyr Tyr Asn Asp Tyr Asn Glu Trp His Pro Lys Lys

210 215 220

Ile Glu Ala Ile Ser Gln Leu Val Arg Ser Leu Lys Glu Lys Gly Val
225 230 235 240

Arg Ile Asp Gly Leu Gly Leu Gln Gly His Trp Gly Met Asp Tyr Pro

245 250 255

Lys Val Glu Glu Ile Asp His Met Leu Thr Glu Tyr Gly Lys Leu Gly

260 265 270

Val Lys Leu Met Ile Thr Glu Leu Asp Ile Asn Met Leu Pro Gln Pro

275 280 285

Asp Pro Ser Gln Arg Gly Ala Asp Ile Thr Arg Asn Tyr Glu Leu Arg

290 295 300

Lys Glu Leu Asp Pro Tyr Ser Asp Gly Leu Pro Pro Asp Met Gln Lys
305 310 315 320

Ala Leu Ala Ala Arg Tyr Ala Glu Ile Phe Glu Val Phe Ala Lys His

325 330 335

Arg Asp Lys Leu Asp Arg Val Thr Phe Trp Gly Val His Asp Gly His

340 345 350

Ser Trp Leu Asn Asn Trp Pro Val Pro Gly Arg Thr Ala Tyr Pro Leu

355 360 365

Leu Phe Asp Thr Lys Leu Gln Pro Lys Pro Ala Phe Asp Ala Val Ile

370 375 380

Gly Val Ala Glu Gln
385 

<210> 137 
<211> 1044 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 137 

gtggatcctt cgctgaagga agcagcttcg ggcaagtttc tgatgggggt agcgttgaat 60 

gtacgtcagg cagcaggtca ggatacttgc gcctcgaaag tggtaaaacg tcattttaat 120 

tccattgtgg ccgagaattg catgaaatgc gaagtgattc atccggagga agaccatttt 180 

gattttacgg aagcggaccg gttggttcgt tttggcgagg agaacgatat ggctgttatc 240 

gggcattgcc ttatctggca ttcacagctg gcaccttggt tctgtgtgga caaacaagga 300 

aaaacagtaa gtgccgacat cttgaaggag cgtataaaaa aacatatcca gactattgtg 360 

acgcactata aagggcgtat aaagggctgg gatgtgttga atgaagccat tgaatcggac 420 

ggctcctggc gtaaatctcc tttttacgag atattaggcg aagagtacat cccgcttatt 480 
tttcagtatg ctcatgaggc agatccggaa gccgaacttt actataatga ttatggcatg 540

gacgggaagg ctaagcgtga caaagtagtc gaattggtaa agatgctgaa agatcgtgga 600 

ctgcgcatcg acgcggtagg tatgcaggga cacatgggaa tggattatcc gtcagtgtcc 660 

gaatttgaag ccagtatact ggcatttgca gctgccggag taaaggtgat ggtaaccgaa 720 

tgggatatga gtgcattgcc cacgacacgg atgggagcca atatttcgga cacggtgtct 780 

tataaacaat ccctgaatcc ctatcccgac ggtttgcccg actctgtgtc tgtggcatgg 840 

aataaccgga tgaaggaatt tttcggtctt ttcctgaaac attcgaatat cattacccgt 900 

gtgacggcgt ggggggtgac ggacggtgac tcatggaaga ataatttccc tgtgcccgga 960 

cgtgtggatt atcctttatt gttcgaccgt gattgccggc cgaaaccttt tgtggaagaa 1020 

ctgattggaa aacagaacat ttaa 1044 

<210> 138 
<211> 347 
<212> PRT 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
<400> 138 

Val Asp Pro Ser Leu Lys Glu Ala Ala Ser Gly Lys Phe Leu Met Gly

1 5 10 15

Val Ala Leu Asn Val Arg Gln Ala Ala Gly Gln Asp Thr Cys Ala Ser

20 25 30

Lys Val Val Lys Arg His Phe Asn Ser Ile Val Ala Glu Asn Cys Met

35 40 45

Lys Cys Glu Val Ile His Pro Glu Glu Asp His Phe Asp Phe Thr Glu

50 55 60

Ala Asp Arg Leu Val Arg Phe Gly Glu Glu Asn Asp Met Ala Val Ile
65 70 75 80

Gly His Cys Leu Ile Trp His Ser Gln Leu Ala Pro Trp Phe Cys Val

85 90 95

Asp Lys Gln Gly Lys Thr Val Ser Ala Asp Ile Leu Lys Glu Arg Ile

100 105 110

Lys Lys His Ile Gln Thr Ile Val Thr His Tyr Lys Gly Arg Ile Lys

115 120 125

Gly Trp Asp Val Leu Asn Glu Ala Ile Glu Ser Asp Gly Ser Trp Arg

130 135 140

Lys Ser Pro Phe Tyr Glu Ile Leu Gly Glu Glu Tyr Ile Pro Leu Ile
145 150 155 160

Phe Gln Tyr Ala His Glu Ala Asp Pro Glu Ala Glu Leu Tyr Tyr Asn

165 170 175

Asp Tyr Gly Met Asp Gly Lys Ala Lys Arg Asp Lys Val Val Glu Leu

180 185 190

Val Lys Met Leu Lys Asp Arg Gly Leu Arg Ile Asp Ala Val Gly Met

195 200 205

Gln Gly His Met Gly Met Asp Tyr Pro Ser Val Ser Glu Phe Glu Ala

210 215 220

Ser Ile Leu Ala Phe Ala Ala Ala Gly Val Lys Val Met Val Thr Glu
225 230 235 240

Trp Asp Met Ser Ala Leu Pro Thr Thr Arg Met Gly Ala Asn Ile Ser

245 250 255

Asp Thr Val Ser Tyr Lys Gln Ser Leu Asn Pro Tyr Pro Asp Gly Leu

260 265 270

Pro Asp Ser Val Ser Val Ala Trp Asn Asn Arg Met Lys Glu Phe Phe

275 280 285

Gly Leu Phe Leu Lys His Ser Asn Ile Ile Thr Arg Val Thr Ala Trp

290 295 300

Gly Val Thr Asp Gly Asp Ser Trp Lys Asn Asn Phe Pro Val Pro Gly
305 310 315 320

Arg Val Asp Tyr Pro Leu Leu Phe Asp Arg Asp Cys Arg Pro Lys Pro

325 330 335

Phe Val Glu Glu Leu Ile Gly Lys Gln Asn Ile
340 345 

<210> 139 
<211> 1143 
<212> DNA 
<213> unknown 
<220>

<223> Obtained from an environmental sample 
<400> 139 

atgaaaaaaa cgattgcaca tttcacctta tggatagtgt tttttctctt cacttcctgt 60 

actgttacgg cgcagaagaa tgctaagaat gcaagagtaa aacccactac cctaaaagag 120 

gcttaccaag gtaaattcta tatcggtact gcgatgaact tgagacagat tcacggagat 180 

gatccccaat ctgaaaatat tatcaaaaaa cagttcaatt ccatagttgc cgaaaactgc 240 

atgaagagta tgtatcttca gccggaggaa ggaaaatttt tcttcgatga tgcggacaag 300 

tttgtggatt ttggtcttca gaacaatatg ttcattatcg ggcattgtct gatttggcat 360 

tcgcaggcgc caaaatggtt tttcaccgac gaaaatggaa acacggtttc tccagaagtt 420 

cttaaacaaa ggatgaaagc ccatatcacc gctgtcgttt cccgctacaa agggaaaatc 480 

aaaggttggg atgtggtgaa cgaagccatt atggaagatg gttcttaccg caaaagcaaa 540 

ttttacgaga ttttgggaga agaatttatt ccgttggcat ttcagtatgc gcatgaagca 600 

gatcctgatg cagaacttta ttacaacgat tataacgaat ggtatcccgg gaaaagagct 660 

atggtgacca aaataatccg cgatttcaaa actagaggaa tccgcatcga tgccatcgga 720 

atgcaggctc atttcgggat ggattcgccc actgtagaag agtatgaaca aactattcag 780 

ggctatataa aagaaggcgt gaaagtcaat attacggaac tcgatttaag tccgcttcct 840 

tctccttggg gaacttccgc caacgttgct gatacgcagc agtatcagga aaaaatgaat 900 

ccttacacca aaggacttcc tgtcgatgta gaaaaagcat gggaaaaccg ttatctcgat 960 

tttttcaaac ttttcctaaa atatcatcag catattgagc gtgtaacttt ttggggagtg 1020 

agcgacatcg attcctggaa aaacgatttt ccgataagag gacgtaccga ttatccacta 1080 

ccgtttaacc gtcaatatca ggcaaaacct ttggttcaga aattaataga cttaacgaaa 1140 

tag 1143

<210> 140 

<211> 380 

<212> PRT 

<213> Unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(24) 

<400> 140 

Met Lys Lys Thr Ile Ala His Phe Thr Leu Trp Ile Val Phe Phe Leu
1 5 10 15

Phe Thr Ser Cys Thr Val Thr Ala Gln Lys Asn Ala Lys Asn Ala Arg

20 25 30

Val Lys Pro Thr Thr Leu Lys Glu Ala Tyr Gln Gly Lys Phe Tyr Ile

35 40 45

Gly Thr Ala Met Asn Leu Arg Gln Ile His Gly Asp Asp Pro Gln Ser

50 55 60

Glu Asn Ile Ile Lys Lys Gln Phe Asn Ser Ile Val Ala Glu Asn Cys
65 70 75 80

Met Lys Ser Met Tyr Leu Gln Pro Glu Glu Gly Lys Phe Phe Phe Asp

85 90 95

Asp Ala Asp Lys Phe Val Asp Phe Gly Leu Gln Asn Asn Met Phe Ile

100 105 110

Ile Gly His Cys Leu Ile Trp His Ser Gln Ala Pro Lys Trp Phe Phe

115 120 125

Thr Asp Glu Asn Gly Asn Thr Val Ser Pro Glu Val Leu Lys Gln Arg

130 135 140

Met Lys Ala His Ile Thr Ala Val Val Ser Arg Tyr Lys Gly Lys Ile
145 150 155 160

Lys Gly Trp Asp Val Val Asn Glu Ala Ile Met Glu Asp Gly Ser Tyr

165 170 175

Arg Lys Ser Lys Phe Tyr Glu Ile Leu Gly Glu Glu Phe Ile Pro Leu

180 185 190

Ala Phe Gln Tyr Ala His Glu Ala Asp Pro Asp Ala Glu Leu Tyr Tyr

195 200 205

Asn Asp Tyr Asn Glu Trp Tyr Pro Gly Lys Arg Ala Met Val Thr Lys

210 215 220

Ile Ile Arg Asp Phe Lys Thr Arg Gly Ile Arg Ile Asp Ala Ile Gly
225 230 235 240

Met Gln Ala His Phe Gly Met Asp Ser Pro Thr Val Glu Glu Tyr Glu
245 250 255

Gln Thr Ile Gln Gly Tyr Ile Lys Glu Gly Val Lys Val Asn Ile Thr

260 265 270

Glu Leu Asp Leu Ser Pro Leu Pro Ser Pro Trp Gly Thr Ser Ala Asn

275 280 285

Val Ala Asp Thr Gln Gln Tyr Gln Glu Lys Met Asn Pro Tyr Thr Lys

290 295 300

Gly Leu Pro Val Asp Val Glu Lys Ala Trp Glu Asn Arg Tyr Leu Asp
305 310 315 320

Phe Phe Lys Leu Phe Leu Lys Tyr His Gln His Ile Glu Arg Val Thr

325 330 335

Phe Trp Gly Val Ser Asp Ile Asp Ser Trp Lys Asn Asp Phe Pro Ile

340 345 350

Arg Gly Arg Thr Asp Tyr Pro Leu Pro Phe Asn Arg Gln Tyr Gln Ala

355 360 365

Lys Pro Leu Val Gln Lys Leu Ile Asp Leu Thr Lys
370 375 380 

<210> 141 
<211> 1134 
<212> DNA 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
<400> 141 

atgaatatct cacgcagaca actactggcg ctcacgggtg ctacggcggc gatcacagca 60 

gccaaattac aggcggcaga aaaagccagc gccgcgaccg gcttgcgcga tgcctacaaa 120 

aatgattttt tgattggcgc tgcgctgagt gcatcgatca ttcaacagca agatccacag 180 

ctagttgcac tgattaataa agactttaat tccatcaccc cagaaaactg tatgaaatgg 240 

ggcgagatgc gcaatgatga cggcagctgg aagtggcagg atgcagacgc atttgtcgag 300 

tatggaagca aatacaaact acatatggtc ggccacacat tggggtggca cagccagatt 360 

cccgatagcg tgtttaaaaa taaagacggt agctatattt ccaaaaccga actcgcgaaa 420 

aaacaaaaag aacacatcac cactattgtt ggccgctaca aaggcaaact tgccgcgtgg 480 

gatgtggtga atgaagctgt cggcgatgac aacaaaatgc gcgatagtca ctggtataaa 540 

atcatgggcg atgattttct cgttaatgca tttaaccttg ctcatgaagt agatccgaag 600 

gcgcatctga tgtacaacga ctacaacaac gagcgcccgg aaaaacgcca ggcgactatc 660 

gatatgatca agcgtctgca acaacgcggt acaccaatcc atggtttggg catgcaagcg 720 

catatcggat tggaaaccaa tatgcaggat tttgaagata gtattctcgc ctattcagca 780 

ttgggtttaa aaatccatct caccgaacta gatatagatg tgctgccctc tgtatggaat 840 

ttaccggtgg ccgaaatttc tacccgcttt gaatacaagc cggaacgcga tccttataca 900 

aaaggtttgc cgaaagagat tgatgaaaaa cttgcaaaag cctatgaatc gctatttaaa 960 

atattgctta aacatcgcga caaaatagat agagttacgt tttggggcgt aagcgatgat 1020 

gccagctggc tcaatgattt cccaatcaat ggcagaacca actatccgtt attgtttaac 1080 

cgtcaacgcc aacctaaagc tgcttatttc cgtttgctgg atttaaaacg ctag 1134 

<210> 142 

<211> 377 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(25) 

<400> 142 

Met Asn Ile Ser Arg Arg Gln Leu Leu Ala Leu Thr Gly Ala Thr Ala

1 5 10 15

Ala Ile Thr Ala Ala Lys Leu Gln Ala Ala Glu Lys Ala Ser Ala Ala

20 25 30

Thr Gly Leu Arg Asp Ala Tyr Lys Asn Asp Phe Leu Ile Gly Ala Ala

35 40 45

Leu Ser Ala Ser Ile Ile Gln Gln Gln Asp Pro Gln Leu Val Ala Leu

50 55 60

Ile Asn Lys Asp Phe Asn Ser Ile Thr Pro Glu Asn Cys Met Lys Trp
65 70 75 80 

Gly Glu Met Arg Asn Asp Asp Gly Ser Trp Lys Trp Gln Asp Ala Asp

85 90 95

Ala Phe Val Glu Tyr Gly Ser Lys Tyr Lys Leu His Met Val Gly His

100 105 110

Thr Leu Gly Trp His Ser Gln Ile Pro Asp Ser Val Phe Lys Asn Lys

115 120 125

Asp Gly Ser Tyr Ile Ser Lys Thr Glu Leu Ala Lys Lys Gln Lys Glu

130 135 140

His Ile Thr Thr Ile Val Gly Arg Tyr Lys Gly Lys Leu Ala Ala Trp
145 150 155 160

Asp Val Val Asn Glu Ala Val Gly Asp Asp Asn Lys Met Arg Asp Ser

165 170 175

His Trp Tyr Lys Ile Met Gly Asp Asp Phe Leu Val Asn Ala Phe Asn

180 185 190

Leu Ala His Glu Val Asp Pro Lys Ala His Leu Met Tyr Asn Asp Tyr

195 200 205

Asn Asn Glu Arg Pro Glu Lys Arg Gln Ala Thr Ile Asp Met Ile Lys

210 215 220

Arg Leu Gln Gln Arg Gly Thr Pro Ile His Gly Leu Gly Met Gln Ala
225 230 235 240

His Ile Gly Leu Glu Thr Asn Met Gln Asp Phe Glu Asp Ser Ile Leu

245 250 255

Ala Tyr Ser Ala Leu Gly Leu Lys Ile His Leu Thr Glu Leu Asp Ile

260 265 270

Asp Val Leu Pro Ser Val Trp Asn Leu Pro Val Ala Glu Ile Ser Thr

275 280 285

Arg Phe Glu Tyr Lys Pro Glu Arg Asp Pro Tyr Thr Lys Gly Leu Pro

290 295 300

Lys Glu Ile Asp Glu Lys Leu Ala Lys Ala Tyr Glu Ser Leu Phe Lys
305 310 315 320

Ile Leu Leu Lys His Arg Asp Lys Ile Asp Arg Val Thr Phe Trp Gly

325 330 335

Val Ser Asp Asp Ala Ser Trp Leu Asn Asp Phe Pro Ile Asn Gly Arg

340 345 350

Thr Asn Tyr Pro Leu Leu Phe Asn Arg Gln Arg Gln Pro Lys Ala Ala

355 360 365 

Tyr Phe Arg Leu Leu Asp Leu Lys Arg 
370 375 

<210> 143 
<211> 3285 
<212> DNA 
<213> Bacteria 

<400> 143 

atgagtttaa aaataaataa aatcatatca tttatcatag ttttttcgat ggtttttggg 60 

acgttaatgt atgtgccaca tctaaaagca tttgcggata ataccggtat taatttggtt 120 

tctaatggtg attttgaatc aggcacaatt gatggctggt ttaaacaagg taatccgaca 180 

ttaacagtaa caactgagca ggcaattggg caatacagta tgaaagttac aggtagaaca 240 

cagacatatg aaggacccgc atatagcttt ttggggaaaa tgcagaaagg tgaatcatat 300 

aacgtatcac ttaaagttag acttgtttct ggacaaaatt catctaatcc tttgatcact 360 

gtaactatgt ttagagaaga tgacaatggc aatcattatg acacaatagt ttggcaaaaa 420 

caagtttctg aagattcatg gactactgta agtgggactt atacattgga ttatactgga 480 

acattaaaaa cattatatat gtatgtagaa tcacccgatc caacgcttga atattatatt 540 

gatgatgttg tagtcacacc gcaaaatcca acgcaaatag gaaatgtagt tgccaatgga 600 

acttttgaaa atgaaaatac ttctggatgg gttggaacag gttcatctgt tgttaaagca 660 

gtatatggtg atgctcacag cggagattat agcttattga cgacaggaag gacagctaac 720 

tggaatggtc ctagttatga tttgactggc aaaatagttc ccggacaaca atacaatgtg 780 

gatttttggg taaaatttat tgatggcaat gatacagagc aaatcaaggc tactgttaaa 840 

gcgacttctg acaaagacaa ttatatacaa gttaatgatt ttgcagatgt aagtaaaggt 900 

gaatggacag aaataaaagg cagttttact ttacctgttg cagattacag cggcattagc 960 

atctatgtgg aatctcaaaa tcctacttta gagttttaca ttgatgattt ttctgtaata 1020 

ggtgaaattg caaataatca gattactatt caaaatgaca ttccagattt gtactctgta 1080 

tttaaagatt attttcctat aggcgttgcg gttgatccaa gtagattaaa tgatactgat 1140 

ccgcatgctc aattgacggc taaacatttt aatatgcttg ttgcagaaaa cgccatgaaa 1200 

cctgaaagtt tacaacccac agaaggaaat tttacttttg ataatgctga taagattgtt 1260 

gattatgcaa tatcacataa tatgaagatg agggggcata ctttactttg gcataatcaa 1320 

gttccagatt ggtttttcca agatccgtct gacccatcca agcctgcttc gagagattta 1380 

ctattacaaa gattaaaaac tcatattaca actgtgttag accattttaa aacaaagtat 1440 
ggttctcaga atccaataat tggatgggat gtcgtaaatg aagttcttga tgataatggc 1500

agtttgagaa attcgaagtg gttgcaaatt attggacccg actatataga aaaagccttt 1560 

gaatatgcac atgaagcgga tccatcgatg aagttgttta ttaatgatta caatatcgaa 1620 

aataatggcg ttaaaactca agctatgtat gacttggtaa aaagattaaa gagtgaaggc 1680 

gttcctatag atggaatagg gatgcaaatg cacataaata taaattccaa tatagataat 1740 

ataaaagcat caatagaaaa actggcatcg ttaggcgttg aaatacaagt aactgaatta 1800 

gatatgaaca tgaacggtaa tatatctaac gaagcattgc tcaagcaagc tagattgtat 1860 

aaacaattat ttgacttatt taaagcagag aaacaatata taactgctgt agttttttgg 1920 

ggagtttcag acgatgtaac ttggcttagc aagccaaatg ctccgctact ttttgataca 1980 

aagttgcaag caaagccagc atactgggca atagtagatc cgaataaagc tacaccagac 2040 

attcaatctg caaaggcttt ggaaggatca ccgacaatgg gtacaaatgt tgataactct 2100 

tggaaacttg taaagccgtt atatgcaaat acttttgtag aagggtcggt cggagcaact 2160 

gctgctgtta agtctatgtg ggatactaaa aacttgtatt tgttagtaca agtttcagac 2220 

aataccccat ctagtaatga tggtattgag atttttgtag ataagaatga tgacaaatcc 2280 

acttcttatg aaactgatga tgaacattat acaattaaga gggatggtac agggagttca 2340 

gatattacca aatatgtgac ttctaatgct gacggatatg tagcacagct agctattcca 2400 

attgaagata ttaatcctgc acttaatgat aaaattggat ttgacattag aataaatgat 2460 

gataaaggta ttggtaatat agatgcaata acagtttgga acgattatac aaacagtcaa 2520 

gatactaata catcgtattt tggcgattta gtattatcaa aacctgcaca aattgcaaca 2580 

gctatatatg gcactcctgt tattgatggt aaagtagatg atatttggaa taatgttgaa 2640 

gctatttcaa caaatacatg ggttttgggt tcaaatggtg ctactgcgac agcaaaaatg 2700 

atgtgggacg ataagtacct ttatgttttg gcggatgtta cagattcaaa tctgaacaaa 2760 

tctagtgtta atccatatga acaagattct gtagaagttt ttgtagatca aaataatgac 2820 

aagacgacat attataaaaa tgatgatgga cagtttagag ttaactatga caatgaacaa 2880 

agctttgggg gaagcactaa ttcaaatgga tttaaatcgg caactagtct tacacaaagt 2940 

ggatatattg tagaagaagc tattccttgg acgagtatca ctccatcaaa tggcactatc 3000 

ataggatttg acttgcaagt taatgatgca gatgaaaatg gtaagaggac aggtattgta 3060 

acatggtgtg atccgagcgg aaattcatgg caagatactt ctgggtttgg gaatttattg 3120 

cttacaggta aaccatccgg tgttggtaca aaaagaatgg cgtttaacga cataaaagac 3180 

agttgggcaa aagatgcaat agaagtatta gcatcaaggc acatagtaga aggtatgaca 3240 

gacactcagt atgaaccaaa caagacagta acgagagcgg aataa 3285 

<210> 144 
<211> 1094 
<212> PRT 
<213> Bacteria 

<220> 

<221> SIGNAL 
<222> (1)...(32)

<400> 144 

Met Ser Leu Lys Ile Asn Lys Ile Ile Ser Phe Ile Ile Val Phe Ser
1 5 10 15

Met Val Phe Gly Thr Leu Met Tyr Val Pro His Leu Lys Ala Phe Ala

20 25 30

Asp Asn Thr Gly Ile Asn Leu Val Ser Asn Gly Asp Phe Glu Ser Gly

35 40 45

Thr Ile Asp Gly Trp Phe Lys Gln Gly Asn Pro Thr Leu Thr Val Thr

50 55 60

Thr Glu Gln Ala Ile Gly Gln Tyr Ser Met Lys Val Thr Gly Arg Thr
65 70 75 80

Gln Thr Tyr Glu Gly Pro Ala Tyr Ser Phe Leu Gly Lys Met Gln Lys

85 90 95

Gly Glu Ser Tyr Asn Val Ser Leu Lys Val Arg Leu Val Ser Gly Gln

100 105 110

Asn Ser Ser Asn Pro Leu Ile Thr Val Thr Met Phe Arg Glu Asp Asp

115 120 125

Asn Gly Asn His Tyr Asp Thr Ile Val Trp Gln Lys Gln Val Ser Glu

130 135 140

Asp Ser Trp Thr Thr Val Ser Gly Thr Tyr Thr Leu Asp Tyr Thr Gly
145 150 155 160

Thr Leu Lys Thr Leu Tyr Met Tyr Val Glu Ser Pro Asp Pro Thr Leu

165 170 175

Glu Tyr Tyr Ile Asp Asp Val Val Val Thr Pro Gln Asn Pro Thr Gln

180 185 190

Ile Gly Asn Val Val Ala Asn Gly Thr Phe Glu Asn Glu Asn Thr Ser

195 200 205

Gly Trp Val Gly Thr Gly Ser Ser Val Val Lys Ala Val Tyr Gly Asp
210 215 220

Ala His Ser Gly Asp Tyr Ser Leu Leu Thr Thr Gly Arg Thr Ala Asn 
225 230 235 240 

Trp Asn Gly Pro Ser Tyr Asp Leu Thr Gly Lys Ile Val Pro Gly Gln

245 250 255

Gln Tyr Asn Val Asp Phe Trp Val Lys Phe Ile Asp Gly Asn Asp Thr

260 265 270

Glu Gln Ile Lys Ala Thr Val Lys Ala Thr Ser Asp Lys Asp Asn Tyr

275 280 285

Ile Gln Val Asn Asp Phe Ala Asp Val Ser Lys Gly Glu Trp Thr Glu

290 295 300

Ile Lys Gly Ser Phe Thr Leu Pro Val Ala Asp Tyr Ser Gly Ile Ser
305 310 315 320

Ile Tyr Val Glu Ser Gln Asn Pro Thr Leu Glu Phe Tyr Ile Asp Asp

325 330 335

Phe Ser Val Ile Gly Glu Ile Ala Asn Asn Gln Ile Thr Ile Gln Asn

340 345 350

Asp Ile Pro Asp Leu Tyr Ser Val Phe Lys Asp Tyr Phe Pro Ile Gly

355 360 365

Val Ala Val Asp Pro Ser Arg Leu Asn Asp Thr Asp Pro His Ala Gln

370 375 380

Leu Thr Ala Lys His Phe Asn Met Leu Val Ala Glu Asn Ala Met Lys
385 390 395 400

Pro Glu Ser Leu Gln Pro Thr Glu Gly Asn Phe Thr Phe Asp Asn Ala

405 410 415

Asp Lys Ile Val Asp Tyr Ala Ile Ser His Asn Met Lys Met Arg Gly

420 425 430

His Thr Leu Leu Trp His Asn Gln Val Pro Asp Trp Phe Phe Gln Asp

435 440 445

Pro Ser Asp Pro Ser Lys Pro Ala Ser Arg Asp Leu Leu Leu Gln Arg

450 455 460

Leu Lys Thr His Ile Thr Thr Val Leu Asp His Phe Lys Thr Lys Tyr
465 470 475 480

Gly Ser Gln Asn Pro Ile Ile Gly Trp Asp Val Val Asn Glu Val Leu

485 490 495 

Asp Asp Asn Gly Ser Leu Arg Asn Ser Lys Trp Leu Gln Ile Ile Gly

500 505 510

Pro Asp Tyr Ile Glu Lys Ala Phe Glu Tyr Ala His Glu Ala Asp Pro

515 520 525

Ser Met Lys Leu Phe Ile Asn Asp Tyr Asn Ile Glu Asn Asn Gly Val

530 535 540

Lys Thr Gln Ala Met Tyr Asp Leu Val Lys Arg Leu Lys Ser Glu Gly
545 550 555 560

Val Pro Ile Asp Gly Ile Gly Met Gln Met His Ile Asn Ile Asn Ser

565 570 575

Asn Ile Asp Asn Ile Lys Ala Ser Ile Glu Lys Leu Ala Ser Leu Gly

580 585 590

Val Glu Ile Gln Val Thr Glu Leu Asp Met Asn Met Asn Gly Asn Ile

595 600 605

Ser Asn Glu Ala Leu Leu Lys Gln Ala Arg Leu Tyr Lys Gln Leu Phe

610 615 620

Asp Leu Phe Lys Ala Glu Lys Gln Tyr Ile Thr Ala Val Val Phe Trp
625 630 635 640

Gly Val Ser Asp Asp Val Thr Trp Leu Ser Lys Pro Asn Ala Pro Leu

645 650 655

Leu Phe Asp Thr Lys Leu Gln Ala Lys Pro Ala Tyr Trp Ala Ile Val

660 665 670

Asp Pro Asn Lys Ala Thr Pro Asp Ile Gln Ser Ala Lys Ala Leu Glu

675 680 685

Gly Ser Pro Thr Met Gly Thr Asn Val Asp Asn Ser Trp Lys Leu Val

690 695 700

Lys Pro Leu Tyr Ala Asn Thr Phe Val Glu Gly Ser Val Gly Ala Thr
705 710 715 720

Ala Ala Val Lys Ser Met Trp Asp Thr Lys Asn Leu Tyr Leu Leu Val

725 730 735

Gln Val Ser Asp Asn Thr Pro Ser Ser Asn Asp Gly Ile Glu Ile Phe

740 745 750

Val Asp Lys Asn Asp Asp Lys Ser Thr Ser Tyr Glu Thr Asp Asp Glu




755 760 765

His Tyr Thr Ile Lys Arg Asp Gly Thr Gly Ser Ser Asp Ile Thr Lys

770 775 780 

Tyr Val Thr Ser Asn Ala Asp Gly Tyr Val Ala Gln Leu Ala Ile Pro
785 790 795 800

Ile Glu Asp Ile Asn Pro Ala Leu Asn Asp Lys Ile Gly Phe Asp Ile

805 810 815

Arg Ile Asn Asp Asp Lys Gly Ile Gly Asn Ile Asp Ala Ile Thr Val

820 825 830

Trp Asn Asp Tyr Thr Asn Ser Gln Asp Thr Asn Thr Ser Tyr Phe Gly

835 840 845

Asp Leu Val Leu Ser Lys Pro Ala Gln Ile Ala Thr Ala Ile Tyr Gly

850 855 860

Thr Pro Val Ile Asp Gly Lys Val Asp Asp Ile Trp Asn Asn Val Glu
865 870 875 880

Ala Ile Ser Thr Asn Thr Trp Val Leu Gly Ser Asn Gly Ala Thr Ala

885 890 895

Thr Ala Lys Met Met Trp Asp Asp Lys Tyr Leu Tyr Val Leu Ala Asp

900 905 910

Val Thr Asp Ser Asn Leu Asn Lys Ser Ser Val Asn Pro Tyr Glu Gln

915 920 925

Asp Ser Val Glu Val Phe Val Asp Gln Asn Asn Asp Lys Thr Thr Tyr

930 935 940

Tyr Lys Asn Asp Asp Gly Gln Phe Arg Val Asn Tyr Asp Asn Glu Gln
945 950 955 960

Ser Phe Gly Gly Ser Thr Asn Ser Asn Gly Phe Lys Ser Ala Thr Ser

965 970 975

Leu Thr Gln Ser Gly Tyr Ile Val Glu Glu Ala Ile Pro Trp Thr Ser

980 985 990

Ile Thr Pro Ser Asn Gly Thr Ile Ile Gly Phe Asp Leu Gln Val Asn

995 1000 1005

Asp Ala Asp Glu Asn Gly Lys Arg Thr Gly Ile Val Thr Trp Cys Asp

1010 1015 1020

Pro Ser Gly Asn Ser Trp Gln Asp Thr Ser Gly Phe Gly Asn Leu Leu
1025 1030 1035 1040

Leu Thr Gly Lys Pro Ser Gly Val Gly Thr Lys Arg Met Ala Phe Asn

1045 1050 1055

Asp Ile Lys Asp Ser Trp Ala Lys Asp Ala Ile Glu Val Leu Ala Ser

1060 1065 1070

Arg His Ile Val Glu Gly Met Thr Asp Thr Gln Tyr Glu Pro Asn Lys

1075 1080 1085

Thr Val Thr Arg Ala Glu
1090 

<210> 145 
<211> 1629 
<212> DNA 
<213> Eukaryote 

<400> 145 

atgaagattg tggatacaac ttccgcagag ataaagattg aaatggaacc tgaaaaagag 60 

atacctgctc tgaaagaagt actaaaagac tacttcaaag tcggagttgc actgccgtcc 120 

aaggtcttcc tcaacccgaa ggacatagaa ctcatcacga aacacttcaa cagcatcacc 180 

gcagaaaacg agatgaaacc ggatagtctg ctcgcgggca tcgaaaacgg taagctgaag 240 

ttcaggtttg aaacagcaga caaatacatt cagttcgtcg aggaaaacgg catggttata 300 

agaggtcaca cactggtgtg gcacaaccag acacccgact ggttcttcaa agacgaaaac 360 

ggaaacctcc tctccaaaga agcgatgacg gaaagactca aagagtacat ccacaccgtt 420 

gtcggacact tcaaaggaaa agtctacgca tgggacgtgg tgaacgaagc ggtcgatccg 480 

aaccagccgg atggactgag aagatccacc tggtaccaga tcatggggcc tgactacata 540 

gaactcgcct tcaagttcgc aagagaagca gatccagatg caaaactctt ctacaacgac 600 

tacaacacat tcgagcccag aaagagagat atcatctaca acctcgtgaa ggatctcaag 660 

gagaagggac tcatcgatgg gataggcatg cagtgtcaca tcagtcttgc aacagacatc 720 

aaacagatcg aagaggccat caaaaagttc agcaccatac ccggtataga aattcacatc 780 

acagaactcg atatgagtgt ctacagagat tccagttcca actacccaga ggcaccgagg 840 

acggcactca tcgaacaggc tcacaaaatg atgcagctct ttgagatttt caagaagtac 900 

agcaacgtga tcacgaacgt cacattctgg ggtctcaagg acgattactc ctggagagca 960 

acaagaagaa acgactggcc gctcatcttc gacaaagatc accaggcgaa actcgcttac 1020 

tgggcgatag tggcacctga ggtccttcca ccacttccaa aagaaagcag gatctccgaa 1080 

ggcgaggctg tggtagtggg gatgatggat gactcgtacc tgatgtcgaa gccgatagag 1140 

atccttgacg aagaagggaa cgtgaaggca acgatcaggg cggtgtggaa agacagcacg 1200 
atctacatct acggagaggt acaggacaag acgaaaaaac cagcagaaga cggagtggcc 1260

atattcatca acccgaacaa cgaaagaaca ccctatctgc agcctgatga cacctacgct 1320 

gtgctgtgga caaactggaa gacggaggtc aacagagaag acgtacaggt gaagaaattc 1380 

gttgggcctg gctttagaag atacagcttc gagatgtcga tcacgatacc gggtgtggag 1440 

ttcaagaaag acagctacat aggattcgac gctgcggtga tagacgacgg gaagtggtac 1500 

agctggagcg acacgacgaa cagccagaag acgaacacga tgaactacgg aacgctgaaa 1560 

ctcgaaggaa taatggtagc gacagcaaaa tacggaacac cggtcatcga tggagagata 1620 

gacgagtaa 1629 

<210> 146 
<211> 542 
<212> PRT 
<213> Eukaryote 

<400> 146 

Met Lys Ile Val Asp Thr Thr Ser Ala Glu Ile Lys Ile Glu Met Glu

1 5 10 15

Pro Glu Lys Glu Ile Pro Ala Leu Lys Glu Val Leu Lys Asp Tyr Phe

20 25 30

Lys Val Gly Val Ala Leu Pro Ser Lys Val Phe Leu Asn Pro Lys Asp

35 40 45

Ile Glu Leu Ile Thr Lys His Phe Asn Ser Ile Thr Ala Glu Asn Glu

50 55 60

Met Lys Pro Asp Ser Leu Leu Ala Gly Ile Glu Asn Gly Lys Leu Lys
65 70 75 80

Phe Arg Phe Glu Thr Ala Asp Lys Tyr Ile Gln Phe Val Glu Glu Asn

85 90 95

Gly Met Val Ile Arg Gly His Thr Leu Val Trp His Asn Gln Thr Pro

100 105 110

Asp Trp Phe Phe Lys Asp Glu Asn Gly Asn Leu Leu Ser Lys Glu Ala

115 120 125

Met Thr Glu Arg Leu Lys Glu Tyr Ile His Thr Val Val Gly His Phe

130 135 140

Lys Gly Lys Val Tyr Ala Trp Asp Val Val Asn Glu Ala Val Asp Pro
145 150 155 160

Asn Gln Pro Asp Gly Leu Arg Arg Ser Thr Trp Tyr Gln Ile Met Gly

165 170 175

Pro Asp Tyr Ile Glu Leu Ala Phe Lys Phe Ala Arg Glu Ala Asp Pro

180 185 190

Asp Ala Lys Leu Phe Tyr Asn Asp Tyr Asn Thr Phe Glu Pro Arg Lys 

195 200 205 

Arg Asp Ile Ile Tyr Asn Leu Val Lys Asp Leu Lys Glu Lys Gly Leu

210 215 220

Ile Asp Gly Ile Gly Met Gln Cys His Ile Ser Leu Ala Thr Asp Ile
225 230 235 240

Lys Gln Ile Glu Glu Ala Ile Lys Lys Phe Ser Thr Ile Pro Gly Ile

245 250 255

Glu Ile His Ile Thr Glu Leu Asp Met Ser Val Tyr Arg Asp Ser Ser

260 265 270

Ser Asn Tyr Pro Glu Ala Pro Arg Thr Ala Leu Ile Glu Gln Ala His

275 280 285

Lys Met Met Gln Leu Phe Glu Ile Phe Lys Lys Tyr Ser Asn Val Ile

290 295 300

Thr Asn Val Thr Phe Trp Gly Leu Lys Asp Asp Tyr Ser Trp Arg Ala
305 310 315 320

Thr Arg Arg Asn Asp Trp Pro Leu Ile Phe Asp Lys Asp His Gln Ala

325 330 335

Lys Leu Ala Tyr Trp Ala Ile Val Ala Pro Glu Val Leu Pro Pro Leu

340 345 350

Pro Lys Glu Ser Arg Ile Ser Glu Gly Glu Ala Val Val Val Gly Met

355 360 365

Met Asp Asp Ser Tyr Leu Met Ser Lys Pro Ile Glu Ile Leu Asp Glu

370 375 380

Glu Gly Asn Val Lys Ala Thr Ile Arg Ala Val Trp Lys Asp Ser Thr
385 390 395 400

Ile Tyr Ile Tyr Gly Glu Val Gln Asp Lys Thr Lys Lys Pro Ala Glu

405 410 415

Asp Gly Val Ala Ile Phe Ile Asn Pro Asn Asn Glu Arg Thr Pro Tyr
420 425 430 
Leu Gln Pro Asp Asp Thr Tyr Ala Val Leu Trp Thr Asn Trp Lys Thr

435 440 445 

Glu Val Asn Arg Glu Asp Val Gln Val Lys Lys Phe Val Gly Pro Gly

450 455 460

Phe Arg Arg Tyr Ser Phe Glu Met Ser Ile Thr Ile Pro Gly Val Glu
465 470 475 480

Phe Lys Lys Asp Ser Tyr Ile Gly Phe Asp Ala Ala Val Ile Asp Asp

485 490 495

Gly Lys Trp Tyr Ser Trp Ser Asp Thr Thr Asn Ser Gln Lys Thr Asn

500 505 510

Thr Met Asn Tyr Gly Thr Leu Lys Leu Glu Gly Ile Met Val Ala Thr

515 520 525

Ala Lys Tyr Gly Thr Pro Val Ile Asp Gly Glu Ile Asp Glu
530 535 540 

<210> 147 
<211> 1146 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample 
<400> 147 

atgactttct ctcgacggca atttttgctg caaacctccg ccggcctggc acttttgagc 60 

actgccaaaa tgcgcgcttt cgcccgcgca gtcgatgaag tgggccttaa agaccatttt 120 

aaagaccatt ttcatattgg gactgccatc agcggtcgac tgatgacgga aatgccggcc 180 

ttttaccgcg acctggttac ccgtgaattc agtgccatta ccatggaaaa cgacatgaaa 240 

tgggagcgtc tgcatcccaa agaaggccaa tgggattggg agattgccga caaattcgtc 300 

aattttggcg aagaaaacga catgtacatt gtcgggcatg ttctggtctg gcactcacag 360 

accccggatt gggtgttcca ggattccaga ggcaagccca tttctcgcga cgctttgctg 420 

aaacgcatgc gccaccagat tgaacagatg gcgggccgct ataagggccg ggtacacgcg 480 

tgggatgtgg tcaatgaggc ggtggacgag gaccaaggct ggcgcaaaag cccgtggttt 540 

aacattattg ggcccgagtt tatggagcac gccttcaatt acgcccacga agtggacccc 600 

gacgctcacc tgttgtacaa cgactacaat atgcacggtc gggaaaaacg cgaattcgtc 660 

ctggatttca tcaaaagata caagaaaaaa ggcattccga tccagggcat aggcatgcaa 720 

ggccatgtgg gcctgagctt tcccgatatc agcgagtttg agaaaagcct gcaagcctac 780 

gccaaacagg gcatgcggat gcacattacc gagctggata tggacgtgtt accagtggcc 840 

tgggatcaca ttggcgccga gatttccacc gagtttgact acgctgatga actggacccc 900 

tggcccaaag ggctgccgga agaagtcgaa caggaattta ccgatcgcta caccgctttc 960 

tttaaactgt ttttgaaata ccgcgatgat attgaaaggg tcaccttctg gggaaccgga 1020 

gatgcggaat cgtggaaaaa taatttccca gtaagggggc gcaccaacta cccgctgctg 1080 

tttgatcgcc gataccgcag aaaaccggcc tatgattcga ttgtcgaact gaccaaaaac 1140 

ctttaa 1146

<210> 148 
<211> 381 
<212> PRT 
<213> unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(28) 

<400> 148 

Met Thr Phe Ser Arg Arg Gln Phe Leu Leu Gln Thr Ser Ala Gly Leu

1 5 10 15

Ala Leu Leu Ser Thr Ala Lys Met Arg Ala Phe Ala Arg Ala Val Asp

20 25 30

Glu Val Gly Leu Lys Asp His Phe Lys Asp His Phe His Ile Gly Thr

35 40 45

Ala Ile Ser Gly Arg Leu Met Thr Glu Met Pro Ala Phe Tyr Arg Asp

50 55 60

Leu Val Thr Arg Glu Phe Ser Ala Ile Thr Met Glu Asn Asp Met Lys
65 70 75 80

Trp Glu Arg Leu His Pro Lys Glu Gly Gln Trp Asp Trp Glu Ile Ala
85 90 95 
Asp Lys Phe Val Asn Phe Gly Glu Glu Asn Asp Met Tyr Ile Val Gly

100 105 110 

His Val Leu Val Trp His Ser Gln Thr Pro Asp Trp Val Phe Gln Asp

115 120 125

Ser Arg Gly Lys Pro Ile Ser Arg Asp Ala Leu Leu Lys Arg Met Arg

130 135 140

His Gln Ile Glu Gln Met Ala Gly Arg Tyr Lys Gly Arg Val His Ala
145 150 155 160

Trp Asp Val Val Asn Glu Ala Val Asp Glu Asp Gln Gly Trp Arg Lys

165 170 175

Ser Pro Trp Phe Asn Ile Ile Gly Pro Glu Phe Met Glu His Ala Phe

180 185 190

Asn Tyr Ala His Glu Val Asp Pro Asp Ala His Leu Leu Tyr Asn Asp

195 200 205

Tyr Asn Met His Gly Arg Glu Lys Arg Glu Phe Val Leu Asp Phe Ile

210 215 220

Lys Arg Tyr Lys Lys Lys Gly Ile Pro Ile Gln Gly Ile Gly Met Gln
225 230 235 240

Gly His Val Gly Leu Ser Phe Pro Asp Ile Ser Glu Phe Glu Lys Ser

245 250 255

Leu Gln Ala Tyr Ala Lys Gln Gly Met Arg Met His Ile Thr Glu Leu

260 265 270

Asp Met Asp Val Leu Pro Val Ala Trp Asp His Ile Gly Ala Glu Ile

275 280 285

Ser Thr Glu Phe Asp Tyr Ala Asp Glu Leu Asp Pro Trp Pro Lys Gly

290 295 300

Leu Pro Glu Glu Val Glu Gln Glu Phe Thr Asp Arg Tyr Thr Ala Phe
305 310 315 320

Phe Lys Leu Phe Leu Lys Tyr Arg Asp Asp Ile Glu Arg Val Thr Phe

325 330 335

Trp Gly Thr Gly Asp Ala Glu Ser Trp Lys Asn Asn Phe Pro Val Arg

340 345 350

Gly Arg Thr Asn Tyr Pro Leu Leu Phe Asp Arg Arg Tyr Arg Arg Lys

355 360 365

Pro Ala Tyr Asp Ser Ile Val Glu Leu Thr Lys Asn Leu
370 375 380 

<210> 149 
<211> 1044 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample 
<400> 149 

atgaagaagc tttttgtcgc ggtcgttttg ttgcccttag caactttttt cgcgtcggac 60 

ggattggagg gagaaccttt gagatcgtta gccgagaaac ttggcatcta catcggttac 120 

gcttcgatca accatttctg gactcttccg gattcaaaca agtacacaga agtggcaaag 180 

agggagttca acatactcac gccagagaac caaatgaagt gggacagcct tcacccagag 240 

cctgacaggt acaacttcac ttacgcagag cgtcatgtcg agttcgcttt ggaaaacaac 300 

atgctcgttc acggccacac actcgtttgg cacaaccaac ttccgttctg gttgaacaga 360 

cagtggacca aagaagaact cctgaaagtc cttgaggacc acatcaaaac agtcgtcggt 420 

cacttcaaag gaagggtgaa gatttgggac gtggtgaacg aagcggtcag cgacatgggc 480 

agttacagag agaccatttg gtacaagacc atcggacccg agtacatcga aaaggcattc 540 

gtgtgggcaa gacaagccga tccggaagcg atcctcatat acaacgacta caatatagaa 600 

acgatcaatc ccaaatcgaa tttcacctac cagctcatca aggagctgaa agaaaaaggt 660 

gtgccgatag acggcatcgg ttttcaaatg cacatagaca tcaacggaat aaactatgac 720 

agtttcagaa acaacctgaa gaggttcgct gatctcggtc tgaagctcta catcacggaa 780 

atggatgtga gaatacccaa gaacgcaact gaaaaagact tggacagaca ggcagaaatc 840 

tacgcgaaga tcttcgaaat ctgcttagag aatcctgcgg tccaagccat acagttctgg 900 

ggtttcacgg acaagtattc ctgggtgcct ggctttttca gcgggtacga tcatgcgctg 960 

atctttgaca gggactacag ccccaagccc gcgtattttg cgataaagag ggtgctcgaa 1020 

gccaaggtga gcaagggacg ctga 1044 

<210> 150 
<211> 347 
<212> PRT 
<213> unknown 
<220>

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(18) 

<400> 150 

Met Lys Lys Leu Phe Val Ala Val Val Leu Leu Pro Leu Ala Thr Phe
1 5 10 15

Phe Ala Ser Asp Gly Leu Glu Gly Glu Pro Leu Arg Ser Leu Ala Glu
20 25 30

Lys Leu Gly Ile Tyr Ile Gly Tyr Ala Ser Ile Asn His Phe Trp Thr
35 40 45

Leu Pro Asp Ser Asn Lys Tyr Thr Glu Val Ala Lys Arg Glu Phe Asn
50 55 60

Ile Leu Thr Pro Glu Asn Gln Met Lys Trp Asp Ser Leu His Pro Glu
65 70 75 80

Pro Asp Arg Tyr Asn Phe Thr Tyr Ala Glu Arg His Val Glu Phe Ala
85 90 95

Leu Glu Asn Asn Met Leu Val His Gly His Thr Leu Val Trp His Asn
100 105 110

Gln Leu Pro Phe Trp Leu Asn Arg Gln Trp Thr Lys Glu Glu Leu Leu
115 120 125

Lys Val Leu Glu Asp His Ile Lys Thr Val Val Gly His Phe Lys Gly
130 135 140

Arg Val Lys Ile Trp Asp Val Val Asn Glu Ala Val Ser Asp Met Gly
145 150 155 160

Ser Tyr Arg Glu Thr Ile Trp Tyr Lys Thr Ile Gly Pro Glu Tyr Ile
165 170 175

Glu Lys Ala Phe Val Trp Ala Arg Gln Ala Asp Pro Glu Ala Ile Leu
180 185 190

Ile Tyr Asn Asp Tyr Asn Ile Glu Thr Ile Asn Pro Lys Ser Asn Phe
195 200 205

Thr Tyr Gln Leu Ile Lys Glu Leu Lys Glu Lys Gly Val Pro Ile Asp
210 215 220

Gly Ile Gly Phe Gln Met His Ile Asp Ile Asn Gly Ile Asn Tyr Asp
225 230 235 240

Ser Phe Arg Asn Asn Leu Lys Arg Phe Ala Asp Leu Gly Leu Lys Leu
245 250 255

Tyr Ile Thr Glu Met Asp Val Arg Ile Pro Lys Asn Ala Thr Glu Lys
260 265 270

Asp Leu Asp Arg Gln Ala Glu Ile Tyr Ala Lys Ile Phe Glu Ile Cys
275 280 285

Leu Glu Asn Pro Ala Val Gln Ala Ile Gln Phe Trp Gly Phe Thr Asp
290 295 300

Lys Tyr Ser Trp Val Pro Gly Phe Phe Ser Gly Tyr Asp His Ala Leu
305 310 315 320

Ile Phe Asp Arg Asp Tyr Ser Pro Lys Pro Ala Tyr Phe Ala Ile Lys
325 330 335

Arg Val Leu Glu Ala Lys Val Ser Lys Gly Arg
340 345

<210> 151 
<211> 1131 
<212> DNA 
<213> Unknown 



<220> 

<223> obtained from an environmental sample
<400> 151

atgcgatcta tgccacttta tgtgttgtta tgcagcgccc ttctgaccgg cagcctatat 60

gcacaagacc aaaatgcttc tttaaaacag gcctttagca aaaactttag tattggcaca 120

gccttaagtg ctacacaaat tcagggcaaa gagccgggca cactggaatt ggtaacacag 180

caatttaacg cggtgacggc agaaaacgtg atgaagtggg aaatcattga acctgtggaa 240

ggccagttca actttgctgc cgccgacgcc atgattgaat tcgccgaagc caatcatatc 300

aaggtgatag gccatgtgct gttatggcac gaacaaacac cagcctgggt atttctggac 360

gccaaaggcc aggccgcctc aaaggaactg gtgttatcac ggctaaaaaa ccatatcaat 420

gccgtaatgg gccgctacaa aggccgtatt catggctggg atgcagtcaa cgaagcctta 480

aatgaagacg gcactctgcg ccaatccaac tggtataaag ctttaggcga cgactatata 540 

gccacagtct ttgaactggc gcatcaggcc gacccgaaag ccgaactcta ttacaacgac 600 

ttcaatttat ttaaaccgga aaaacgcgct ggtgtactca aactggtggc agctttaaaa 660 

gcgaaaaatg tgcctatcca cggcataggc gagcaaggcc attacagcct ggattaccct 720 

gagctgcagc aagtagaaga ctctattgtg gcttttaaaa acactggcct gaaagtggtg 780 

attaccgaac tggatatctc agttttaccc ttccctgagc cagaaaagat tggtgctgat 840 

atctcactca atatgcagtt aaaacaagaa cttaatccct acgccgatgg cttacccaaa 900 

gaagtcagcg atcaactgac agaaaaatac ctgcaattat ttcagctatt tttacgccac 960 

agcgacgcca tcgaacgcgt gaccttatgg ggcgtaaacg acaaccaaac ctggcgcaac 1020 

aactggccaa tgaaaggcag aacagactac cccttactct tcgaccggaa aaaccagcca 1080 

aaagaagtgg ttcctgcatt gattaaactg gcggaaaaag ctggtaaata a 1131

<210> 152 
<211> 376 
<212> PRT 
<213> unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(21)

<400> 152

Met Arg Ser Met Pro Leu Tyr Val Leu Leu Cys Ser Ala Leu Leu Thr
1 5 10 15

Gly Ser Leu Tyr Ala Gln Asp Gln Asn Ala Ser Leu Lys Gln Ala Phe
20 25 30

Ser Lys Asn Phe Ser Ile Gly Thr Ala Leu Ser Ala Thr Gln Ile Gln
35 40 45

Gly Lys Glu Pro Gly Thr Leu Glu Leu Val Thr Gln Gln Phe Asn Ala
50 55 60

Val Thr Ala Glu Asn Val Met Lys Trp Glu Ile Ile Glu Pro Val Glu
65 70 75 80

Gly Gln Phe Asn Phe Ala Ala Ala Asp Ala Met Ile Glu Phe Ala Glu
85 90 95

Ala Asn His Ile Lys Val Ile Gly His Val Leu Leu Trp His Glu Gln
100 105 110

Thr Pro Ala Trp Val Phe Leu Asp Ala Lys Gly Gln Ala Ala Ser Lys
115 120 125

Glu Leu Val Leu Ser Arg Leu Lys Asn His Ile Asn Ala Val Met Gly
130 135 140

Arg Tyr Lys Gly Arg Ile His Gly Trp Asp Ala Val Asn Glu Ala Leu
145 150 155 160

Asn Glu Asp Gly Thr Leu Arg Gln Ser Asn Trp Tyr Lys Ala Leu Gly
165 170 175

Asp Asp Tyr Ile Ala Thr Val Phe Glu Leu Ala His Gln Ala Asp Pro
180 185 190

Lys Ala Glu Leu Tyr Tyr Asn Asp Phe Asn Leu Phe Lys Pro Glu Lys
195 200 205

Arg Ala Gly Val Leu Lys Leu Val Ala Ala Leu Lys Ala Lys Asn Val
210 215 220

Pro Ile His Gly Ile Gly Glu Gln Gly His Tyr Ser Leu Asp Tyr Pro
225 230 235 240

Glu Leu Gln Gln Val Glu Asp Ser Ile Val Ala Phe Lys Asn Thr Gly
245 250 255

Leu Lys Val Val Ile Thr Glu Leu Asp Ile Ser Val Leu Pro Phe Pro
260 265 270

Glu Pro Glu Lys Ile Gly Ala Asp Ile Ser Leu Asn Met Gln Leu Lys
275 280 285

Gln Glu Leu Asn Pro Tyr Ala Asp Gly Leu Pro Lys Glu Val Ser Asp
290 295 300

Gln Leu Thr Glu Lys Tyr Leu Gln Leu Phe Gln Leu Phe Leu Arg His
305 310 315 320

Ser Asp Ala Ile Glu Arg Val Thr Leu Trp Gly Val Asn Asp Asn Gln
325 330 335

Thr Trp Arg Asn Asn Trp Pro Met Lys Gly Arg Thr Asp Tyr Pro Leu
340 345 350

Leu Phe Asp Arg Lys Asn Gln Pro Lys Glu Val Val Pro Ala Leu Ile
355 360 365

Lys Leu Ala Glu Lys Ala Gly Lys
370 375

<210> 153 
<211> 1020 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 153 

atgggtgcta tgggcctggc ggcgctgtat tcgctgccag ccaatgcaca gacctgcatt 60 

acgcagagtc agacgggcac caacaacggc cactattttt cgttctggaa ggacaatccg 120 

ggaacggtca atttctgtat gtatgccaac ggccgttaca cgtctaactg gaacggcatc 180 

aacaattggg tcggcggcaa aggctggcaa accggctcgc gcagaaacgt cacctactct 240 

ggctcgttca actctcccgg caatggctat ctggctgctc tactggctgg accaccaatc 300 

ctgttggtcg agtactacat catcgagagc tggggaaatt ggcgcccgcc gggttcggat 360 

ggaacattgt taggcaccgt cactagcgac ggcggtactt acgatatcta tcgctcgcgc 420 

cgcaccaacg cgccttgtat cactggcaac tcctgtaact tcgatcagta ctggagcgta 480 

cggcaatcca agcgcgtggg cggcacgatt accacgggca atcacttcga cgcttgggcg 540 

gcacgcggct tgaacctcgg cacgcacaac taccaagtga tggcgaccga gggatatcag 600 

agcaacggca gctccgacat caccattagc gacaacccgg gaccgacgcc aggacccact 660 

ccgaacccga atcccacgcc gggcaccaag aatttcacgg tgcgcgcgcg cggaaccgcg 720 

gggggtgagt ccatcacgct gcgtgtgaac aatcagaacg tgcagacctg gacgctgtcg 780 

accagctacc agaacttcac ggcgtccacg acgttgagtg gtggcatcac ggtcgcgttc 840 

accaatgatg gtggtagtcg agacgttcag gtggattaca tccaggtgaa cggcgcaact 900 

cgacaatccg agagccagac gtacaacacc ggcctctatg ccaacggcag ttgcggcggc 960 

ggctcgaaca gcgagtggat gcattgcaat ggagcgatcg gctacggcaa cacgccgtag 1020 

<210> 154 

<211> 339 

<212> PRT 

<213> unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL
<222> (1)...(16) 

<400> 154 

Met Gly Ala Met Gly Leu Ala Ala Leu Tyr Ser Leu Pro Ala Asn Ala
1 5 10 15

Gln Thr Cys Ile Thr Gln Ser Gln Thr Gly Thr Asn Asn Gly His Tyr
20 25 30

Phe Ser Phe Trp Lys Asp Asn Pro Gly Thr Val Asn Phe Cys Met Tyr
35 40 45

Ala Asn Gly Arg Tyr Thr Ser Asn Trp Asn Gly Ile Asn Asn Trp Val
50 55 60

Gly Gly Lys Gly Trp Gln Thr Gly Ser Arg Arg Asn Val Thr Tyr Ser
65 70 75 80

Gly Ser Phe Asn Ser Pro Gly Asn Gly Tyr Leu Ala Ala Leu Leu Ala
85 90 95

Gly Pro Pro Ile Leu Leu Val Glu Tyr Tyr Ile Ile Glu Ser Trp Gly
100 105 110

Asn Trp Arg Pro Pro Gly Ser Asp Gly Thr Leu Leu Gly Thr Val Thr
115 120 125

Ser Asp Gly Gly Thr Tyr Asp Ile Tyr Arg Ser Arg Arg Thr Asn Ala
130 135 140

Pro Cys Ile Thr Gly Asn Ser Cys Asn Phe Asp Gln Tyr Trp Ser Val
145 150 155 160

Arg Gln Ser Lys Arg Val Gly Gly Thr Ile Thr Thr Gly Asn His Phe
165 170 175

Asp Ala Trp Ala Ala Arg Gly Leu Asn Leu Gly Thr His Asn Tyr Gln
180 185 190

Val Met Ala Thr Glu Gly Tyr Gln Ser Asn Gly Ser Ser Asp Ile Thr
195 200 205

Ile Ser Asp Asn Pro Gly Pro Thr Pro Gly Pro Thr Pro Asn Pro Asn
210 215 220

Pro Thr Pro Gly Thr Lys Asn Phe Thr Val Arg Ala Arg Gly Thr Ala
225 230 235 240

Gly Gly Glu Ser Ile Thr Leu Arg Val Asn Asn Gln Asn Val Gln Thr
245 250 255

Trp Thr Leu Ser Thr Ser Tyr Gln Asn Phe Thr Ala Ser Thr Thr Leu
260 265 270

Ser Gly Gly Ile Thr Val Ala Phe Thr Asn Asp Gly Gly Ser Arg Asp
275 280 285

Val Gln Val Asp Tyr Ile Gln Val Asn Gly Ala Thr Arg Gln Ser Glu
290 295 300

Ser Gln Thr Tyr Asn Thr Gly Leu Tyr Ala Asn Gly Ser Cys Gly Gly
305 310 315 320

Gly Ser Asn Ser Glu Trp Met His Cys Asn Gly Ala Ile Gly Tyr Gly
325 330 335

Asn Thr Pro



<210> 155 
<211> 1836 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 155 

atgaaaggat taattgcggc agcgcttgct ggcttggcat tcggggcctc cctatcctgg 60 

ggacagtgca caacgtttac caccagtacc attcagaatt gtaatggcat tgattacgag 120 

ctctggagtc agaataacaa gggcaccgta agcatgaaga ttacgggagg gagcacgaat 180 

ccgaatggag gaactttcga tgctacctgg aatggcaccg agaatatcct ggctagagct 240 

ggtaagaaat ggggctcgtc cagcactacc acccccacgt ccgcaggcaa tattactctt 300 

gaattcgcgg cgacatggtc ctcaagcgat aacgtaaaaa tgcttggagt ctatggctgg 360 

gcgtactatc caactggaag tatcccgact aaacaggaaa atggagcaag tacctcattc 420 

acaaatcaaa ttgagtacta catcatccag gatcgtggta gctataatgc tgcatcgggt 480 

ggaacgaact ccaaaaaata cggcgaaggg acgatcgatg gaattctgta tgaattctat 540 

atcgcagaca gaatcaacca gcctgatctg tcaggaaaga gtggaaactt caagcaatac 600 

ttcagcgtcc cgaaaagtac gagcagccat aggcaaagtg ggacgattac cgtttccaaa 660 

catttccagg cctgggaaaa tgccggaatg aaaatgatgt cctgtcgctt gtatgaagtc 720 

gcaatgaaag tcgagtccta taccggttct gcgaccggtg ttggctctgc gaaggttaca 780 

aagaatatac tcaccattgg tggaatcttg agcagtagca gtactgcaag cagtagcagc 840 

acagtaagta gcagtagcag caatgcatat acgcttgtca cgaatgtttc tcccgctgga 900 

gccggaacag tgaccaggag ccccaatact gcgacctatg ccccgaatgc ttcagtacag 960 

cttactgcaa cgccgagtac cggttggaaa tttgtcggtt gggctgggga tcttacgtca 1020 

actacgagta ctgctaccgt caccatgacc aaagatatta ccgcaactgc aaaatttgaa 1080 

ctggtatcgg gagatggcac gaccaacttg atcaaggatg gaaacttccc cagtagcagc 1140 

gtcatctcca caggtgatgg cacctcctgg aagctcgggc aaggtacaaa ctggggtaat 1200 

tccgcagcaa cgacgagtgt cagcaatgga atcgcgactg tcaatgtgac caccattgga 1260 

tctcaaacct atcaacccca gctaattcag tataacgtgg ctctttacaa ggatatgagc 1320 

tacaagctca ccttcaaggc aaaagctgct gctgcaagga aaattgaagt cgcattccaa 1380 

cagtcggtgg acccatgggc tggatatgct tccaaggaat tcgatcttac aacgacagag 1440 

cagacatatg agttcgtatt taaaatgact agcgctactg acacggcttc acagttcgcg 1500 

ttcaatctcg gccaggcaac aggcgccgtc aatattagtg atgtaaagct agtatatacg 1560 

acagctggta caacacccgt attccgtgga tataatgagg cggcaacaca ggagaggcct 1620 

gtattcatat ccttggatgg taggacgttg aacattgttc cagtgtatgg agccaaactg 1680 

caggtcaagt tagtggacat caatggtaag atgagagcct ccttcaatgt ggtcggaatt 1740 

gcttccatcc cgctgtccaa tatccccgct gggcggtatt atattgacgt aagtggtgac 1800 

ggcgttaagc aggcatcccc gatagttctg gaataa 1836 

<210> 156 
<211> 611 
<212> PRT 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 
<221> SIGNAL
<222> (1)...(21)

<400> 156

Met Lys Gly Leu Ile Ala Ala Ala Leu Ala Gly Leu Ala Phe Gly Ala
1 5 10 15

Ser Leu Ser Trp Gly Gln Cys Thr Thr Phe Thr Thr Ser Thr Ile Gln
20 25 30

Asn Cys Asn Gly Ile Asp Tyr Glu Leu Trp Ser Gln Asn Asn Lys Gly
35 40 45

Thr Val Ser Met Lys Ile Thr Gly Gly Ser Thr Asn Pro Asn Gly Gly
50 55 60

Thr Phe Asp Ala Thr Trp Asn Gly Thr Glu Asn Ile Leu Ala Arg Ala
65 70 75 80

Gly Lys Lys Trp Gly Ser Ser Ser Thr Thr Thr Pro Thr Ser Ala Gly
85 90 95

Asn Ile Thr Leu Glu Phe Ala Ala Thr Trp Ser Ser Ser Asp Asn Val
100 105 110

Lys Met Leu Gly Val Tyr Gly Trp Ala Tyr Tyr Pro Thr Gly Ser Ile
115 120 125

Pro Thr Lys Gln Glu Asn Gly Ala Ser Thr Ser Phe Thr Asn Gln Ile
130 135 140

Glu Tyr Tyr Ile Ile Gln Asp Arg Gly Ser Tyr Asn Ala Ala Ser Gly
145 150 155 160

Gly Thr Asn Ser Lys Lys Tyr Gly Glu Gly Thr Ile Asp Gly Ile Leu
165 170 175

Tyr Glu Phe Tyr Ile Ala Asp Arg Ile Asn Gln Pro Asp Leu Ser Gly
180 185 190

Lys Ser Gly Asn Phe Lys Gln Tyr Phe Ser Val Pro Lys Ser Thr Ser
195 200 205

Ser His Arg Gln Ser Gly Thr Ile Thr Val Ser Lys His Phe Gln Ala
210 215 220

Trp Glu Asn Ala Gly Met Lys Met Met Ser Cys Arg Leu Tyr Glu Val
225 230 235 240

Ala Met Lys Val Glu Ser Tyr Thr Gly Ser Ala Thr Gly Val Gly Ser
245 250 255

Ala Lys Val Thr Lys Asn Ile Leu Thr Ile Gly Gly Ile Leu Ser Ser
260 265 270

Ser Ser Thr Ala Ser Ser Ser Ser Thr Val Ser Ser Ser Ser Ser Asn
275 280 285

Ala Tyr Thr Leu Val Thr Asn Val Ser Pro Ala Gly Ala Gly Thr Val
290 295 300

Thr Arg Ser Pro Asn Thr Ala Thr Tyr Ala Pro Asn Ala Ser Val Gln
305 310 315 320

Leu Thr Ala Thr Pro Ser Thr Gly Trp Lys Phe Val Gly Trp Ala Gly
325 330 335

Asp Leu Thr Ser Thr Thr Ser Thr Ala Thr Val Thr Met Thr Lys Asp
340 345 350

Ile Thr Ala Thr Ala Lys Phe Glu Leu Val Ser Gly Asp Gly Thr Thr
355 360 365

Asn Leu Ile Lys Asp Gly Asn Phe Pro Ser Ser Ser Val Ile Ser Thr
370 375 380

Gly Asp Gly Thr Ser Trp Lys Leu Gly Gln Gly Thr Asn Trp Gly Asn
385 390 395 400

Ser Ala Ala Thr Thr Ser Val Ser Asn Gly Ile Ala Thr Val Asn Val
405 410 415

Thr Thr Ile Gly Ser Gln Thr Tyr Gln Pro Gln Leu Ile Gln Tyr Asn
420 425 430

Val Ala Leu Tyr Lys Asp Met Ser Tyr Lys Leu Thr Phe Lys Ala Lys
435 440 445

Ala Ala Ala Ala Arg Lys Ile Glu Val Ala Phe Gln Gln Ser Val Asp
450 455 460

Pro Trp Ala Gly Tyr Ala Ser Lys Glu Phe Asp Leu Thr Thr Thr Glu
465 470 475 480

Gln Thr Tyr Glu Phe Val Phe Lys Met Thr Ser Ala Thr Asp Thr Ala
485 490 495

Ser Gln Phe Ala Phe Asn Leu Gly Gln Ala Thr Gly Ala Val Asn Ile
500 505 510

Ser Asp Val Lys Leu Val Tyr Thr Thr Ala Gly Thr Thr Pro Val Phe
515 520 525

Arg Gly Tyr Asn Glu Ala Ala Thr Gln Glu Arg Pro Val Phe Ile Ser
530 535 540

Leu Asp Gly Arg Thr Leu Asn Ile Val Pro Val Tyr Gly Ala Lys Leu
545 550 555 560

Gln Val Lys Leu Val Asp Ile Asn Gly Lys Met Arg Ala Ser Phe Asn
565 570 575

Val Val Gly Ile Ala Ser Ile Pro Leu Ser Asn Ile Pro Ala Gly Arg
580 585 590

Tyr Tyr Ile Asp Val Ser Gly Asp Gly Val Lys Gln Ala Ser Pro Ile
595 600 605

Val Leu Glu
610

<210> 157 
<211> 645 
<212> DNA
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
<400> 157 

atgtttaagt taagtaagaa aattttgatg gtgttattaa caatttcaat gagttttatt 60 

agcttatttg cagtaaccgc gtatgcagct tcgacagact actggcaaaa ttggactgat 120 

ggtggtggga cagtaaatgc taccaatgga tctgatggca attacagtgt ttcatggtca 180 

aattgcggga attttgttgt tggtaaaggc tggactaccg gatcagcaac tagggtaata 240 

aactataatg ccggagcctt ttcgccgtcc ggcaatggat atttagctct ttatgggtgg 300 

acgagaaatt cactcataga atattacgtc gttgatagct gggggactta tagacctact 360 

ggaacttata aaggcactgt gactagtgat ggagggacat atgacatata cacgactaca 420 

cgaaccaacg caccttccat tgacggcaat aatacaaatt tcacccagtt ctggagtgtt 480 

aggcagtcaa agagaccgat tggtaccaac aataccatca cttttagcaa ccacgttaac 540 

gcctggaaga gtaaaggaat gaatctgggg agtagttggg cttatcaggt attagcgaca 600 

gagggatatc aaagtagtgg gtactctaac gtaacggtct ggtaa 645 

<210> 158 

<211> 214 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(29)

<400> 158

Met Phe Lys Leu Ser Lys Lys Ile Leu Met Val Leu Leu Thr Ile Ser
1 5 10 15

Met Ser Phe Ile Ser Leu Phe Ala Val Thr Ala Tyr Ala Ala Ser Thr
20 25 30

Asp Tyr Trp Gln Asn Trp Thr Asp Gly Gly Gly Thr Val Asn Ala Thr
35 40 45

Asn Gly Ser Asp Gly Asn Tyr Ser Val Ser Trp Ser Asn Cys Gly Asn
50 55 60

Phe Val Val Gly Lys Gly Trp Thr Thr Gly Ser Ala Thr Arg Val Ile
65 70 75 80

Asn Tyr Asn Ala Gly Ala Phe Ser Pro Ser Gly Asn Gly Tyr Leu Ala
85 90 95

Leu Tyr Gly Trp Thr Arg Asn Ser Leu Ile Glu Tyr Tyr Val Val Asp
100 105 110

Ser Trp Gly Thr Tyr Arg Pro Thr Gly Thr Tyr Lys Gly Thr Val Thr
115 120 125

Ser Asp Gly Gly Thr Tyr Asp Ile Tyr Thr Thr Thr Arg Thr Asn Ala
130 135 140

Pro Ser Ile Asp Gly Asn Asn Thr Asn Phe Thr Gln Phe Trp Ser Val
145 150 155 160

Arg Gln Ser Lys Arg Pro Ile Gly Thr Asn Asn Thr Ile Thr Phe Ser
165 170 175

Asn His Val Asn Ala Trp Lys Ser Lys Gly Met Asn Leu Gly Ser Ser
180 185 190

Trp Ala Tyr Gln Val Leu Ala Thr Glu Gly Tyr Gln Ser Ser Gly Tyr
195 200 205

Ser Asn Val Thr Val Trp
210

<210> 159 
<211> 1041 
<212> DNA
<213> unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 159 

atgatcagtc tcaaacgagt ggcggcgctc ctgtgcgtcg caggtctggg catgtctgcg 60 

gcaaacgcgc agacctgcct cacgtcgagt caaaccggca ctaacaatgg cttctattat 120 

tccttctgga aggacagtcc gggcacggtg aatttttgcc tgcagtccgg cggccgttac 180 

acatcgaact ggagcggcat caacaactgg gtgggcggca agggatggca gaccggttca 240 

cgccggaaca tcacgtactc gggcagcttc aattcaccgg gcaacggcta cctggcgctt 300 

tacggatgga ccaccaatcc actcgtcgag tactacgtcg tcgatagctg ggggagctgg 360 

cgtccgccgg gttcggacgg aacgttcctg gggacggtca acagcgatgg cggaacgtat 420 

gacatctatc gcgcgcagcg ggtcaacgcg ccgtccatca tcggcaacgc cacgttctat 480 

caatactgga gcgttcggca gtcgaagcgg gtaggtggga cgatcaccac cggaaaccac 540 

ttcgacgcgt gggccagcgt gggcctgaac ctgggcactc acaactacca gatcatggcg 600 

accgagggct accaaagcag cggcagctcc gacatcacgg tgagtgaagg cggtagcagc 660 

agtggtggcg gaagcagcac gagcagcagc agcggcggtg gtggcaccaa gagcttcacg 720 

gttcgtgcgc gcggtaccgc gggcggtgag tccatcacgc tgcgcgtgaa caaccagaac 780 

gtgcagacct ggacgctggg caccagcatg acgaactaca cggcgtcgac ttcactgagc 840 

ggcggcatca ccgtggtgta cacgaacgac agcggtaacc gcgacgtgca ggtggactac 900 

atcgtcgtga acggccagac gcgccagtcc gaagcccaga gctacaacac cggcctttat 960 

gcgaacgggc gttgcggcgg tggctccaac agcgaatgga tgcattgcaa cggcgccatc 1020 

ggctacggca atacaccgta a 1041 

<210> 160 
<211> 346 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(23)

<400> 160

Met Ile Ser Leu Lys Arg Val Ala Ala Leu Leu Cys Val Ala Gly Leu
1 5 10 15

Gly Met Ser Ala Ala Asn Ala Gln Thr Cys Leu Thr Ser Ser Gln Thr
20 25 30

Gly Thr Asn Asn Gly Phe Tyr Tyr Ser Phe Trp Lys Asp Ser Pro Gly
35 40 45

Thr Val Asn Phe Cys Leu Gln Ser Gly Gly Arg Tyr Thr Ser Asn Trp
50 55 60

Ser Gly Ile Asn Asn Trp Val Gly Gly Lys Gly Trp Gln Thr Gly Ser
65 70 75 80

Arg Arg Asn Ile Thr Tyr Ser Gly Ser Phe Asn Ser Pro Gly Asn Gly
85 90 95

Tyr Leu Ala Leu Tyr Gly Trp Thr Thr Asn Pro Leu Val Glu Tyr Tyr
100 105 110

Val Val Asp Ser Trp Gly Ser Trp Arg Pro Pro Gly Ser Asp Gly Thr
115 120 125

Phe Leu Gly Thr Val Asn Ser Asp Gly Gly Thr Tyr Asp Ile Tyr Arg
130 135 140

Ala Gln Arg Val Asn Ala Pro Ser Ile Ile Gly Asn Ala Thr Phe Tyr
145 150 155 160

Gln Tyr Trp Ser Val Arg Gln Ser Lys Arg Val Gly Gly Thr Ile Thr
165 170 175

Thr Gly Asn His Phe Asp Ala Trp Ala Ser Val Gly Leu Asn Leu Gly
180 185 190

Thr His Asn Tyr Gln Ile Met Ala Thr Glu Gly Tyr Gln Ser Ser Gly
195 200 205

Ser Ser Asp Ile Thr Val Ser Glu Gly Gly Ser Ser Ser Gly Gly Gly
210 215 220

Ser Ser Thr Ser Ser Ser Ser Gly Gly Gly Gly Thr Lys Ser Phe Thr
225 230 235 240

Val Arg Ala Arg Gly Thr Ala Gly Gly Glu Ser Ile Thr Leu Arg Val
245 250 255

Asn Asn Gln Asn Val Gln Thr Trp Thr Leu Gly Thr Ser Met Thr Asn
260 265 270

Tyr Thr Ala Ser Thr Ser Leu Ser Gly Gly Ile Thr Val Val Tyr Thr
275 280 285

Asn Asp Ser Gly Asn Arg Asp Val Gln Val Asp Tyr Ile Val Val Asn
290 295 300

Gly Gln Thr Arg Gln Ser Glu Ala Gln Ser Tyr Asn Thr Gly Leu Tyr
305 310 315 320

Ala Asn Gly Arg Cys Gly Gly Gly Ser Asn Ser Glu Trp Met His Cys
325 330 335

Asn Gly Ala Ile Gly Tyr Gly Asn Thr Pro
340 345

<210> 161 
<211> 1047 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 161 

atgttcaaag gtcttttgaa atcggtcctc accggcaagc gagccggtgc ggtgttcatc 60 

tgtctggccg gactgtggat gacacaggcg caggcgcaga cgtgcatcgg ttcaccacaa 120 

acgggcaaca acggcggctt cttcttttcg ttctggaaag acaatccggg gtcggtgaat 180 

ttctgcatgt actccggcgg tcgctatacc tccagctgga gcggcatcaa caactgggta 240 

ggtgggaagg gctggcaaac cggttcatcc cgcacggtga cgtattcggg cacgttcaac 300 

tcgccgggaa acggctacct gactctgtac ggatggacca ccaatccgct ggtcgagtac 360 

tacatcgtgg acagctgggg cagctaccgt ccgcctggag gccagggctt catgggcacg 420 

gtcaccagcg acggcggaac gtatgacatc taccgggttc gccgcaccaa tgcgccgtgc 480 

atcacaggca acaactgcaa cttcgaccag tactggagcg tgcgtcagtc gaggcgggtg 540 

ggcggcacca tcaccaccgc caaccatttc aacgcgtggc gtacgctcgg catgaatctc 600 

gggcagcaca actaccaggt gatggcgacc gaaggattcc agagcagtgg cagctcggac 660 

atcaccgtga gcgaaggatc tggcggtggc ggcggaggtg gcggcggtgg caccaagagc 720 

ttcacggtgc gcgcgcgcgg caccgcgggc ggcgagtcca tcacgctgcg cgtcaacaac 780 

caggtcgtgc agagctggac cttgagcacc agcatgcaga actacacggc ctcgaccacg 840 

atgagcggcg gcatcacggt gaacttcacc aacgacggca ccaaccgcga cgtgcaggtg 900 

gactacatca tcgtgaatgg ccagacgcgt cagtccgaag cgcagacgta caacaccggg 960 

ctgtacgcca acggccgttg cggtggcggg tcgaacagcg agtggatgca ttgcaatggc 1020 

gcgatcgggt acggcgacac gccctga 1047 

<210> 162 

<211> 348 

<212> PRT 

<213> unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(32)

<400> 162

Met Phe Lys Gly Leu Leu Lys Ser Val Leu Thr Gly Lys Arg Ala Gly
1 5 10 15

Ala Val Phe Ile Cys Leu Ala Gly Leu Trp Met Thr Gln Ala Gln Ala
20 25 30

Gln Thr Cys Ile Gly Ser Pro Gln Thr Gly Asn Asn Gly Gly Phe Phe
35 40 45

Phe Ser Phe Trp Lys Asp Asn Pro Gly Ser Val Asn Phe Cys Met Tyr
50 55 60

Ser Gly Gly Arg Tyr Thr Ser Ser Trp Ser Gly Ile Asn Asn Trp Val
65 70 75 80

Gly Gly Lys Gly Trp Gln Thr Gly Ser Ser Arg Thr Val Thr Tyr Ser
85 90 95

Gly Thr Phe Asn Ser Pro Gly Asn Gly Tyr Leu Thr Leu Tyr Gly Trp
100 105 110

Thr Thr Asn Pro Leu Val Glu Tyr Tyr Ile Val Asp Ser Trp Gly Ser
115 120 125

Tyr Arg Pro Pro Gly Gly Gln Gly Phe Met Gly Thr Val Thr Ser Asp
130 135 140

Gly Gly Thr Tyr Asp Ile Tyr Arg Val Arg Arg Thr Asn Ala Pro Cys
145 150 155 160

Ile Thr Gly Asn Asn Cys Asn Phe Asp Gln Tyr Trp Ser Val Arg Gln
165 170 175

Ser Arg Arg Val Gly Gly Thr Ile Thr Thr Ala Asn His Phe Asn Ala
180 185 190

Trp Arg Thr Leu Gly Met Asn Leu Gly Gln His Asn Tyr Gln Val Met
195 200 205

Ala Thr Glu Gly Phe Gln Ser Ser Gly Ser Ser Asp Ile Thr Val Ser
210 215 220

Glu Gly Ser Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Thr Lys Ser
225 230 235 240

Phe Thr Val Arg Ala Arg Gly Thr Ala Gly Gly Glu Ser Ile Thr Leu
245 250 255

Arg Val Asn Asn Gln Val Val Gln Ser Trp Thr Leu Ser Thr Ser Met
260 265 270

Gln Asn Tyr Thr Ala Ser Thr Thr Met Ser Gly Gly Ile Thr Val Asn
275 280 285

Phe Thr Asn Asp Gly Thr Asn Arg Asp Val Gln Val Asp Tyr Ile Ile
290 295 300

Val Asn Gly Gln Thr Arg Gln Ser Glu Ala Gln Thr Tyr Asn Thr Gly
305 310 315 320

Leu Tyr Ala Asn Gly Arg Cys Gly Gly Gly Ser Asn Ser Glu Trp Met
325 330 335

His Cys Asn Gly Ala Ile Gly Tyr Gly Asp Thr Pro
340 345

<210> 163 
<211> 1068 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 163 

atgaaagcaa agagaatgaa gttgtttgcc gcatttttac tctgttttac gcttgcactt 60 

cctggggcag tgcatgcgca gacgatcacc agcaattcgg tcggtacgca tgacggttat 120 

gactatgaat actggaagga cagcgggaat ggaactatgg ttctcggtag tggcggtacg 180 

ttcagtgccg agtggagcaa tatcaataat attctgttcc gtaaaggcaa gaagttcaat 240 

gagacgcaga cccatcagca aattggaaac atttccataa cctatggtgc cacctaccaa 300 

ccgaatggca attcgtattt aacggtctat ggctggacgg ttgaccccct cgtcgaatat 360 

tacattgtcg atagctgggg cagctggcgt ccgcctggag catcgccaaa ggggactgtt 420 

aacgttgacg gaggaacgta tgacatttat gagacaactc gtgtcaacca gccttccatt 480 

aaaggcacgg caaccttcaa gcagtattgg agtgtccgga cgtcaaaacg gacgagcgga 540 

accatatctg taagcgagca ctttaaggcc tgggagaaat tggggatgac catgggcaag 600 

atgtatgaag tcgcgcttac ggttgaaggc tatcaaagca gtggaagcgc taatgtgtat 660 

agccatacac tgacgatcgg cgggggaaca acacctccac caaccacagg cacaaagatc 720 

gaagccgaga gtatgaccaa aagcggacaa tacactggga atatcagctc gccgttcaac 780 

ggagtcgctt tgtatgccaa caatgattcc gtgaaattca cgcataattt cacgaccggc 840 

acccataact tctcactccg gggggcatca aacaactcca atatggcccg ggttgacctg 900 

aaaatcggcg ggcagacgaa ggggaccttc tatttcggcg gaagcagccc tgcggtctat 960 

actctgaata atgtcagcca tggaaccgga aatcaagagg ttgaactcgt tgtaaccgcc 1020 

gataacggaa catgggatgc tttcattgat tatctcgaga tccattaa 1068 

<210> 164 
<211> 355 

<212> PRT 

<213> unknown 
<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(26)

<400> 164

Met Lys Ala Lys Arg Met Lys Leu Phe Ala Ala Phe Leu Leu Cys Phe
1 5 10 15

Thr Leu Ala Leu Pro Gly Ala Val His Ala Gln Thr Ile Thr Ser Asn
20 25 30

Ser Val Gly Thr His Asp Gly Tyr Asp Tyr Glu Tyr Trp Lys Asp Ser
35 40 45

Gly Asn Gly Thr Met Val Leu Gly Ser Gly Gly Thr Phe Ser Ala Glu
50 55 60

Trp Ser Asn Ile Asn Asn Ile Leu Phe Arg Lys Gly Lys Lys Phe Asn
65 70 75 80

Glu Thr Gln Thr His Gln Gln Ile Gly Asn Ile Ser Ile Thr Tyr Gly
85 90 95

Ala Thr Tyr Gln Pro Asn Gly Asn Ser Tyr Leu Thr Val Tyr Gly Trp
100 105 110

Thr Val Asp Pro Leu Val Glu Tyr Tyr Ile Val Asp Ser Trp Gly Ser
115 120 125

Trp Arg Pro Pro Gly Ala Ser Pro Lys Gly Thr Val Asn Val Asp Gly
130 135 140

Gly Thr Tyr Asp Ile Tyr Glu Thr Thr Arg Val Asn Gln Pro Ser Ile
145 150 155 160

Lys Gly Thr Ala Thr Phe Lys Gln Tyr Trp Ser Val Arg Thr Ser Lys
165 170 175

Arg Thr Ser Gly Thr Ile Ser Val Ser Glu His Phe Lys Ala Trp Glu
180 185 190

Lys Leu Gly Met Thr Met Gly Lys Met Tyr Glu Val Ala Leu Thr Val
195 200 205

Glu Gly Tyr Gln Ser Ser Gly Ser Ala Asn Val Tyr Ser His Thr Leu
210 215 220

Thr Ile Gly Gly Gly Thr Thr Pro Pro Pro Thr Thr Gly Thr Lys Ile
225 230 235 240

Glu Ala Glu Ser Met Thr Lys Ser Gly Gln Tyr Thr Gly Asn Ile Ser
245 250 255

Ser Pro Phe Asn Gly Val Ala Leu Tyr Ala Asn Asn Asp Ser Val Lys
260 265 270

Phe Thr His Asn Phe Thr Thr Gly Thr His Asn Phe Ser Leu Arg Gly
275 280 285

Ala Ser Asn Asn Ser Asn Met Ala Arg Val Asp Leu Lys Ile Gly Gly
290 295 300

Gln Thr Lys Gly Thr Phe Tyr Phe Gly Gly Ser Ser Pro Ala Val Tyr
305 310 315 320

Thr Leu Asn Asn Val Ser His Gly Thr Gly Asn Gln Glu Val Glu Leu
325 330 335

Val Val Thr Ala Asp Asn Gly Thr Trp Asp Ala Phe Ile Asp Tyr Leu
340 345 350

Glu Ile His
355

<210> 165 
<211> 1047 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample 
<400> 165 

gtggggcgca ggagcgccgc cacggcattc atcggcctgg cagcgctgtg tgcctcggcc 60

gccaacgcgc agacctgtct gagctcgagt cagaccggca ccaacaacgg cttctactat 120

tcgttctgga ccgacggcgg tggctccgtg cagttctgcc tgcaatccgc cgggcgctac 180

acctccagct ggagcaatgt cggaaactgg gtcggtggca agggctggca gaccggcgcg 240

cgccgcaaca tcaactattc cggcagcttc aatccctcgg gtaacgcgta cctggccgtc 300 

tatggctgga ccacgaatcc cctggtggag tactacatcg tcgacaactg gggtacctat 360 

cgtccaccgg gtgggcaggg attcatgggc acggttgtca gcgatggcgg cacctacgac 420 

gtctaccgca cgcaacgggt caacgcgccc tccattcagg gcaacgcgac cttctaccag 480 

tactggagcg ttcgccagtc gaagcgcacc ggtggaacca tctccaccgg caaccatttc 540 

gacggctggg cgacgttcgg catgaacctg ggaaccttca attaccagat cgtggcgacc 600 

gagggctacc agagcagcgg caattccgac atcacggtga gcgatggcgg cagcagctcc 660 

tcgtcctcca gcagcagcag ttcgtcgtcc tccagcagcg gcggtggcgg caccaagagc 720 

ttcacggtgc gcgcgcgcgg cacggccgga ggcgagtcga tcagcctgcg ggtcaacaac 780 

accaacgtgc agacctggtc gctgaccacc agctaccaga atctcacggc ctcgaccacg 840 

ctgaccggcg gcatcaccgt caactacacc aacgacagca gcggtcacga cgtacaggtg 900 

gactacatca tcgtgaacgg ccagacccgc cagtccgagg cgcagagcta caacaccgga 960 

ctctatgcca acgggcgctg cggtggtggt ggctacagcg agtggatgca ttgcaacggc 1020 

gccatcggct acggcaatac gccgtaa 1047 

<210> 166 
<211> 348 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(23) 

<400> 166 

Val Gly Arg Arg Ser Ala Ala Thr Ala Phe Ile Gly Leu Ala Ala Leu
1 5 10 15

Cys Ala Ser Ala Ala Asn Ala Gln Thr Cys Leu Ser Ser Ser Gln Thr
20 25 30

Gly Thr Asn Asn Gly Phe Tyr Tyr Ser Phe Trp Thr Asp Gly Gly Gly
35 40 45

Ser Val Gln Phe Cys Leu Gln Ser Ala Gly Arg Tyr Thr Ser Ser Trp
50 55 60

Ser Asn Val Gly Asn Trp Val Gly Gly Lys Gly Trp Gln Thr Gly Ala
65 70 75 80

Arg Arg Asn Ile Asn Tyr Ser Gly Ser Phe Asn Pro Ser Gly Asn Ala
85 90 95

Tyr Leu Ala Val Tyr Gly Trp Thr Thr Asn Pro Leu Val Glu Tyr Tyr
100 105 110

Ile Val Asp Asn Trp Gly Thr Tyr Arg Pro Pro Gly Gly Gln Gly Phe
115 120 125

Met Gly Thr Val Val Ser Asp Gly Gly Thr Tyr Asp Val Tyr Arg Thr
130 135 140

Gln Arg Val Asn Ala Pro Ser Ile Gln Gly Asn Ala Thr Phe Tyr Gln
145 150 155 160

Tyr Trp Ser Val Arg Gln Ser Lys Arg Thr Gly Gly Thr Ile Ser Thr
165 170 175

Gly Asn His Phe Asp Gly Trp Ala Thr Phe Gly Met Asn Leu Gly Thr
180 185 190

Phe Asn Tyr Gln Ile Val Ala Thr Glu Gly Tyr Gln Ser Ser Gly Asn
195 200 205

Ser Asp Ile Thr Val Ser Asp Gly Gly Ser Ser Ser Ser Ser Ser Ser
210 215 220

Ser Ser Ser Ser Ser Ser Ser Ser Ser Gly Gly Gly Gly Thr Lys Ser
225 230 235 240

Phe Thr Val Arg Ala Arg Gly Thr Ala Gly Gly Glu Ser Ile Ser Leu
245 250 255

Arg Val Asn Asn Thr Asn Val Gln Thr Trp Ser Leu Thr Thr Ser Tyr
260 265 270

Gln Asn Leu Thr Ala Ser Thr Thr Leu Thr Gly Gly Ile Thr Val Asn
275 280 285

Tyr Thr Asn Asp Ser Ser Gly His Asp Val Gln Val Asp Tyr Ile Ile
290 295 300

Val Asn Gly Gln Thr Arg Gln Ser Glu Ala Gln Ser Tyr Asn Thr Gly
305 310 315 320

Leu Tyr Ala Asn Gly Arg Cys Gly Gly Gly Gly Tyr Ser Glu Trp Met
325 330 335

His Cys Asn Gly Ala Ile Gly Tyr Gly Asn Thr Pro
340 345

<210> 167 
<211> 669 
<212> DNA 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
<400> 167 

gtgaagctga aaagactgtt caagatcgga ctgctgccgg ccgtattgtt gtttagtgca 60 

acgcagcagt taaccgcgca aaccatctgc agcaaccaga ccggcaccaa caacggctac 120 

ttctactcgt tctggaagga caccgggtcg gcgtgcatga cactgggttc cggcggcaac 180 

tacagcgtca actggaacct gggttccggg aacatggtct gcggcaaagg ctggagtacc 240 

ggatcttcaa gccgcagaat cggctacaac gccggcgtct gggcgccgaa cggcaatgcc 300 

tacctgactc tgtatgggtg gaccaggaac ccgctcatcg agtactacgt ggtcgacagt 360 

tggggaagct ggaggccgcc aggcggaacc tccgcgggca ccgtcaatag cgatggcggg 420 

acctacaacc tctatcggac gcagcgggtc aacgcgcctt ccatcgacgg cacccggacg 480

ttctatcagt actggagtgt ccggacctcg aagaggccca ccgggagcaa ccagaccatc 540 

accttcgcga accacgtgaa tgcgtggagg agcaaagggt ggaatctggg gagtcacgtc 600 

taccagataa tggcaacaga gggatatcaa agcagcggga attccaacct gacggtgtgg 660 

gcgcagtag 669

<210> 168 

<211> 222 

<212> PRT 

<213> Unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(36) 

<400> 168 

Val Lys Leu Lys Arg Leu Phe Lys Ile Gly Leu Leu Pro Ala Val Leu
1 5 10 15

Leu Phe Ser Ala Thr Gln Gln Leu Thr Ala Gln Thr Ile Cys Ser Asn

20 25 30

Gln Thr Gly Thr Asn Asn Gly Tyr Phe Tyr Ser Phe Trp Lys Asp Thr

35 40 45

Gly Ser Ala Cys Met Thr Leu Gly Ser Gly Gly Asn Tyr Ser Val Asn

50 55 60

Trp Asn Leu Gly Ser Gly Asn Met Val Cys Gly Lys Gly Trp Ser Thr
65 70 75 80

Gly Ser Ser Ser Arg Arg Ile Gly Tyr Asn Ala Gly Val Trp Ala Pro

85 90 95

Asn Gly Asn Ala Tyr Leu Thr Leu Tyr Gly Trp Thr Arg Asn Pro Leu

100 105 110

Ile Glu Tyr Tyr Val Val Asp Ser Trp Gly Ser Trp Arg Pro Pro Gly

115 120 125

Gly Thr Ser Ala Gly Thr Val Asn Ser Asp Gly Gly Thr Tyr Asn Leu

130 135 140

Tyr Arg Thr Gln Arg Val Asn Ala Pro Ser Ile Asp Gly Thr Arg Thr
145 150 155 160

Phe Tyr Gln Tyr Trp Ser Val Arg Thr Ser Lys Arg Pro Thr Gly Ser

165 170 175

Asn Gln Thr Ile Thr Phe Ala Asn His Val Asn Ala Trp Arg Ser Lys

180 185 190

Gly Trp Asn Leu Gly Ser His Val Tyr Gln Ile Met Ala Thr Glu Gly

195 200 205

Tyr Gln Ser Ser Gly Asn Ser Asn Leu Thr Val Trp Ala Gln
210 215 220

<210> 169 
<211> 1041 
<212> DNA
<213> Unknown

<220> 

<223> obtained from an environmental sample 
<400> 169 

atgattgtta gtttcaagag cgtgaaggca ctcgcgtgcc tcgccgtgct cggcattacc 60 

gccgcgcagg cgcaaacctg catcacttcc agccagaccg gtaccaacaa cggcaactac 120 

ttttccttct ggaaggacag cccgggtacc gtcaacttct gcatgtatgc caatgggcgc 180 

tacacctcca actggagcgg catcaacaac tgggtgggcg gcaagggctg gcagacgggc 240 

tccaaccgca cggtgaccta ctccggttcg ttcaattcgc ccggcaatgg ctatctcacc 300 

ttgtacggat ggaccacgaa tccattgatc gagtactaca tcgtcgacag ctggggcacc 360 

tatcgaccgc cgggcggcca gggcttcatg ggcaccgtca acagcgatgg cggcacctat 420 

gacatctacc gcacgcagcg cgtgaaccag ccttccatca tcggcaccgc cacgttctac 480 

cagtactgga gcgtgcggca gtcgaagcgc gtcggcggca cgatcaccac ggccaaccac 540 

ttcaacgcct gggccacgct gggcatgaac ctgggccagc acaactacca ggtcatggcc 600 

accgagggtt accagagcag tggcagctcc gacatcaccg tgaccgaggg cggcggctcc 660 

tcgtcgtcca gtggcggcgg cagcaccagc agtggcggtg gcggcagcaa gagcttcacc 720 

gtgcgtgcgc gcggcacggt cggcggcgaa aacatccagc tgcaggtcaa caaccagacg 780 

gtggcgagct ggaacctgac caccagcatg cagaactaca acgcctcgac cagcctgagt 840 

ggcggcatca ccgtcgtgta caccaatgac agcggcagcc gcgacgtgca ggtggactac 900 

atcgtcgtca acggccagac ccgccagtcc gaagcccaga gctacaacac cgggctctat 960 

gccaacggac gttgtggtgg cggctcgaac agcgagtgga tgcattgcaa cggcgcgatt 1020 

ggctacggca acacgcccta g 1041 

<210> 170 
<211> 346 
<212> PRT 
<213> Unknown

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(24)

<400> 170 

Met Ile Val Ser Phe Lys Ser Val Lys Ala Leu Ala Cys Leu Ala Val

1 5 10 15

Leu Gly Ile Thr Ala Ala Gln Ala Gln Thr Cys Ile Thr Ser Ser Gln

20 25 30

Thr Gly Thr Asn Asn Gly Asn Tyr Phe Ser Phe Trp Lys Asp Ser Pro

35 40 45

Gly Thr Val Asn Phe Cys Met Tyr Ala Asn Gly Arg Tyr Thr Ser Asn

50 55 60

Trp Ser Gly Ile Asn Asn Trp Val Gly Gly Lys Gly Trp Gln Thr Gly
65 70 75 80

Ser Asn Arg Thr Val Thr Tyr Ser Gly Ser Phe Asn Ser Pro Gly Asn

85 90 95

Gly Tyr Leu Thr Leu Tyr Gly Trp Thr Thr Asn Pro Leu Ile Glu Tyr

100 105 110

Tyr Ile Val Asp Ser Trp Gly Thr Tyr Arg Pro Pro Gly Gly Gln Gly

115 120 125

Phe Met Gly Thr Val Asn Ser Asp Gly Gly Thr Tyr Asp Ile Tyr Arg

130 135 140

Thr Gln Arg Val Asn Gln Pro Ser Ile Ile Gly Thr Ala Thr Phe Tyr
145 150 155 160

Gln Tyr Trp Ser Val Arg Gln Ser Lys Arg Val Gly Gly Thr Ile Thr

165 170 175

Thr Ala Asn His Phe Asn Ala Trp Ala Thr Leu Gly Met Asn Leu Gly

180 185 190

Gln His Asn Tyr Gln Val Met Ala Thr Glu Gly Tyr Gln Ser Ser Gly

195 200 205

Ser Ser Asp Ile Thr Val Thr Glu Gly Gly Gly Ser Ser Ser Ser Ser

210 215 220

Gly Gly Gly Ser Thr Ser Ser Gly Gly Gly Gly Ser Lys Ser Phe Thr
225 230 235 240

Val Arg Ala Arg Gly Thr Val Gly Gly Glu Asn Ile Gln Leu Gln Val
245 250 255

Asn Asn Gln Thr Val Ala Ser Trp Asn Leu Thr Thr Ser Met Gln Asn

260 265 270

Tyr Asn Ala Ser Thr Ser Leu Ser Gly Gly Ile Thr Val Val Tyr Thr

275 280 285

Asn Asp Ser Gly Ser Arg Asp Val Gln Val Asp Tyr Ile Val Val Asn

290 295 300

Gly Gln Thr Arg Gln Ser Glu Ala Gln Ser Tyr Asn Thr Gly Leu Tyr
305 310 315 320

Ala Asn Gly Arg Cys Gly Gly Gly Ser Asn Ser Glu Trp Met His Cys

325 330 335

Asn Gly Ala Ile Gly Tyr Gly Asn Thr Pro
340 345

<210> 171 
<211> 678 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 171 

atggagttga aaaaaatatc cagaaaagga ctgccactag tattcttgtc cttgttgttg 60 

ttcagtgtaa cgcagcagtc aaacgcccaa accatctgca gcaatcaaac tggcacaaac 120 

aacggtttct tctattcgtt ttggaaggac accggatcag catgcatgac tttgggctct 180 

ggcggcaatt acgacgtaag ttggaatctg ggttctggga atatggttgt cggcaaaggc 240 

tggagtaccg gatcatcaac caggagagta ggctacaatg ccggcatctg gcagccgaac 300 

ggcaatgcat atttggctct ctatgggtgg acgagaaacc cacttataga atattacgtc 360 

gttgatagct ggggcacttt caggccgcct ggaggaacgt caataggctc cgtcaccact 420 

gatggtggta cataccaaat atatcggacc cagcgagtca acgcgccttc cattgacggc 480 

gccagaactt tttatcagta ctggagtgtc cggacctcga agagaccgac cgggagcaac 540 

caaaccatca cctttgcgaa tcacgttaac gcgtggagga atctaggttt gaatctgggg 600 

agtcatgttt accagataat ggccacagag ggatttcata gcagtgggag atctaaccta 660 

acggtgtggt cacagtaa 678

<210> 172 
<211> 225 
<212> PRT 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(29) 

<400> 172 

Met Glu Leu Lys Lys Ile Ser Arg Lys Gly Leu Pro Leu Val Phe Leu

1 5 10 15

Ser Leu Leu Leu Phe Ser Val Thr Gln Gln Ser Asn Ala Gln Thr Ile

20 25 30

Cys Ser Asn Gln Thr Gly Thr Asn Asn Gly Phe Phe Tyr Ser Phe Trp

35 40 45

Lys Asp Thr Gly Ser Ala Cys Met Thr Leu Gly Ser Gly Gly Asn Tyr

50 55 60

Asp Val Ser Trp Asn Leu Gly Ser Gly Asn Met Val Val Gly Lys Gly
65 70 75 80

Trp Ser Thr Gly Ser Ser Thr Arg Arg Val Gly Tyr Asn Ala Gly Ile

85 90 95

Trp Gln Pro Asn Gly Asn Ala Tyr Leu Ala Leu Tyr Gly Trp Thr Arg

100 105 110

Asn Pro Leu Ile Glu Tyr Tyr Val Val Asp Ser Trp Gly Thr Phe Arg

115 120 125

Pro Pro Gly Gly Thr Ser Ile Gly Ser Val Thr Thr Asp Gly Gly Thr

130 135 140

Tyr Gln Ile Tyr Arg Thr Gln Arg Val Asn Ala Pro Ser Ile Asp Gly
145 150 155 160

Ala Arg Thr Phe Tyr Gln Tyr Trp Ser Val Arg Thr Ser Lys Arg Pro
165 170 175

Thr Gly Ser Asn Gln Thr Ile Thr Phe Ala Asn His Val Asn Ala Trp

180 185 190

Arg Asn Leu Gly Leu Asn Leu Gly Ser His Val Tyr Gln Ile Met Ala

195 200 205

Thr Glu Gly Phe His Ser Ser Gly Arg Ser Asn Leu Thr Val Trp Ser
210 215 220

Gln
225

<210> 173 
<211> 1503 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 173 

ttgaaaaaac tcgcagctgc cttatcactt gcaattacct ttgccgtacc gacaatagta 60 

caagcacaag gtcccacatg gactaccagc acaatacaga aatacaacaa ctacgactat 120 

gaactctgga atgaaaacaa tcagggtacc gtttccatga agctcacagg agataacggt 180 

accgctgcca atgcggtagg cggaacgttt gagtctactt ggagtggtac aaagaatgtg 240 

cttttccgtt ccggcagaaa gtttaccggt acttcagggc aaagcgttga tggtggcggt 300 

gctggcaaaa ccgctagtgc ttacggcaat ataagcatta acttcgccgc tacgtggtct 360 

tccggtgacg atgtgaagat gcttggcgta tatggttggg cgttttacgc actgccaagt 420 

gtaccagaca aacaggaaaa cggcacttct actaattttt ccaatcaaat agaatactac 480 

atcattcaag accgcggcag ctataactcg gctacaggtg gcaccaactc aaagaaatac 540 

ggtgaggcta ccattgacgg cattgcttat gagttccgtg tatgtgatag aatagggcaa 600 

cctatgttaa ctggcaacgg gaattttaag cagtatttca gtgttcctaa aagcactata 660 

aaccaccgca ccagcggtac aatctctgtt tccaaacact ttgaagaatg ggaaaaagtc 720 

ggcatgaaaa tggacggtcc cttatacgaa gtagcgatga aagttgaatc ctattctggc 780 

aatgggaata gtaacggcaa tgctaaaatt acaaagaata ttttgaccat tggcggaaca 840 

accacaactc aaagcagttc aagcggaggt tcaacggttc cagatgaatg tggcgaatat 900 

aaaaagagtt tctgtggtgg cttgggatat ggaagcgtat attccaattt aaccgcaata 960 

ccctcaacgg gcgactgctt atacatcgga gattttgaag taatccagcc agctttgaat 1020 

tcaaccgttg ccataaacgg tgtggaaaat acctgcggaa gcgagtggtc agattgccct 1080 

tacaatgata aacccgattc aaaaaaagat ggcggctatt atgtttatgt gaaaacaggc 1140 

tcaattaaca attatgagaa taacggttgg caaaacattg tagctaaagc aaaaccggct 1200 

tgcacaccac cttctagcag ttccggtgct gcaccaggtt cttcttcttc agacgaagaa 1260 

gacccagagc caattttgaa aaatcgcatt cctataactc atttttccct tcaaacgctt 1320 

agcgataaag ccttgcgcat agaagtaaat gctccaacta ttgtggacat ttttgacctg 1380 

agagggaata aggttaaaag tttgaatgtt tacggttcgc aaagggttaa attatccctg 1440 

ccgagcgggg tgtattttgc caaagtgcgc gggatgaaaa gcgttagatt tgtgttgagg 1500 

taa 1503

<210> 174 
<211> 500 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(22) 

<400> 174 

Leu Lys Lys Leu Ala Ala Ala Leu Ser Leu Ala Ile Thr Phe Ala Val
1 5 10 15

Pro Thr Ile Val Gln Ala Gln Gly Pro Thr Trp Thr Thr Ser Thr Ile

20 25 30

Gln Lys Tyr Asn Asn Tyr Asp Tyr Glu Leu Trp Asn Glu Asn Asn Gln

35 40 45

Gly Thr Val Ser Met Lys Leu Thr Gly Asp Asn Gly Thr Ala Ala Asn

50 55 60

Ala Val Gly Gly Thr Phe Glu Ser Thr Trp Ser Gly Thr Lys Asn Val
65 70 75 80

Leu Phe Arg Ser Gly Arg Lys Phe Thr Gly Thr Ser Gly Gln Ser Val
85 90 95

Asp Gly Gly Gly Ala Gly Lys Thr Ala Ser Ala Tyr Gly Asn Ile Ser

100 105 110

Ile Asn Phe Ala Ala Thr Trp Ser Ser Gly Asp Asp Val Lys Met Leu

115 120 125

Gly Val Tyr Gly Trp Ala Phe Tyr Ala Leu Pro Ser Val Pro Asp Lys

130 135 140

Gln Glu Asn Gly Thr Ser Thr Asn Phe Ser Asn Gln Ile Glu Tyr Tyr
145 150 155 160

Ile Ile Gln Asp Arg Gly Ser Tyr Asn Ser Ala Thr Gly Gly Thr Asn

165 170 175

Ser Lys Lys Tyr Gly Glu Ala Thr Ile Asp Gly Ile Ala Tyr Glu Phe

180 185 190

Arg Val Cys Asp Arg Ile Gly Gln Pro Met Leu Thr Gly Asn Gly Asn

195 200 205

Phe Lys Gln Tyr Phe Ser Val Pro Lys Ser Thr Ile Asn His Arg Thr

210 215 220

Ser Gly Thr Ile Ser Val Ser Lys His Phe Glu Glu Trp Glu Lys Val
225 230 235 240

Gly Met Lys Met Asp Gly Pro Leu Tyr Glu Val Ala Met Lys Val Glu

245 250 255

Ser Tyr Ser Gly Asn Gly Asn Ser Asn Gly Asn Ala Lys Ile Thr Lys

260 265 270

Asn Ile Leu Thr Ile Gly Gly Thr Thr Thr Thr Gln Ser Ser Ser Ser

275 280 285

Gly Gly Ser Thr Val Pro Asp Glu Cys Gly Glu Tyr Lys Lys Ser Phe

290 295 300

Cys Gly Gly Leu Gly Tyr Gly Ser Val Tyr Ser Asn Leu Thr Ala Ile
305 310 315 320

Pro Ser Thr Gly Asp Cys Leu Tyr Ile Gly Asp Phe Glu Val Ile Gln

325 330 335

Pro Ala Leu Asn Ser Thr Val Ala Ile Asn Gly Val Glu Asn Thr Cys

340 345 350

Gly Ser Glu Trp Ser Asp Cys Pro Tyr Asn Asp Lys Pro Asp Ser Lys

355 360 365

Lys Asp Gly Gly Tyr Tyr Val Tyr Val Lys Thr Gly Ser Ile Asn Asn

370 375 380

Tyr Glu Asn Asn Gly Trp Gln Asn Ile Val Ala Lys Ala Lys Pro Ala
385 390 395 400

Cys Thr Pro Pro Ser Ser Ser Ser Gly Ala Ala Pro Gly Ser Ser Ser

405 410 415

Ser Asp Glu Glu Asp Pro Glu Pro Ile Leu Lys Asn Arg Ile Pro Ile

420 425 430

Thr His Phe Ser Leu Gln Thr Leu Ser Asp Lys Ala Leu Arg Ile Glu

435 440 445

Val Asn Ala Pro Thr Ile Val Asp Ile Phe Asp Leu Arg Gly Asn Lys

450 455 460

Val Lys Ser Leu Asn Val Tyr Gly Ser Gln Arg Val Lys Leu Ser Leu
465 470 475 480

Pro Ser Gly Val Tyr Phe Ala Lys Val Arg Gly Met Lys Ser Val Arg
485 490 495

Phe Val Leu Arg
500

<210> 175 
<211> 1053 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 175 

atgaagtcca ttcgcagccg cagcctcgcc accgccgtcc tggctggcgc cctcggcgtc 60 
gcagccgcag gcgcgcaggc gcagacgctc aacaacaatt ccaccggcac gcacgacggc 120 
tactactaca cgttctggaa ggactcgggc agcgcctcga tgaccctcca tccgggcgga 180 
cgctacagct cccagtggac cagcaacacc aacaactggg tcggcgggaa aggctggaat 240 
cccggtggcc cgcgcgtggt caactactcg ggctactacg gggtcaacaa cagccagaac 300 
tcctacctgg cgctgtacgg ctggacccgc aatccgctgg tcgagtacta cgtgatcgag 360 
agctacggct cctacaaccc ggccagttgc gccggcgggg tggactacgg cagcttccag 420

agcgatggcg ccacctacaa cgtacgtcgc tgcctgcgcc agaacgcgcc gtcgatcgaa 480 

ggcaacaaca gcaccttcta ccagtacttc agcgtgcgca atcccaagaa gggattcggc 540 

aacatctccg gcacgatcac cgtcgccaac cacttcaact actgggccag ccgcggcctc 600 

aacctcggca accacgacta catggtgttc gccaccgagg gctaccagag ccagggcagc 660 

agcgacatca ccgtgagttc gggtaccggc ggcggcggtg gcggcggcaa cacgggcagc 720 

aagaccatcg tggtgcgcgc gcgcggcacc gccggcggag agaacatctc gctcaaggtc 780 

aacaacgcca ccatcgccag ctggacgctc accaccagca tggccaacta cacggccacc 840 

acctcggcat cgggcggctc gctggtggag ttcaccaacg acggcggcaa ccgcgacgtg 900 

caggtggact acctcagcgt caatggcgcc gtccgccagg ccgaggacca gacctacaac 960 

accggcgtgt accagaacgg ccagtgcggc ggcggcaacg gccgcagcga atggctgcac 1020 

tgcaacggtg ccatcggctt cggaaatctc tga 1053 

<210> 176 
<211> 350 
<212> PRT 
<213> Unknown

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(27)

<400> 176 

Met Lys Ser Ile Arg Ser Arg Ser Leu Ala Thr Ala Val Leu Ala Gly

1 5 10 15

Ala Leu Gly Val Ala Ala Ala Gly Ala Gln Ala Gln Thr Leu Asn Asn

20 25 30

Asn Ser Thr Gly Thr His Asp Gly Tyr Tyr Tyr Thr Phe Trp Lys Asp

35 40 45

Ser Gly Ser Ala Ser Met Thr Leu His Pro Gly Gly Arg Tyr Ser Ser

50 55 60

Gln Trp Thr Ser Asn Thr Asn Asn Trp Val Gly Gly Lys Gly Trp Asn
65 70 75 80

Pro Gly Gly Pro Arg Val Val Asn Tyr Ser Gly Tyr Tyr Gly Val Asn

85 90 95

Asn Ser Gln Asn Ser Tyr Leu Ala Leu Tyr Gly Trp Thr Arg Asn Pro

100 105 110

Leu Val Glu Tyr Tyr Val Ile Glu Ser Tyr Gly Ser Tyr Asn Pro Ala

115 120 125

Ser Cys Ala Gly Gly Val Asp Tyr Gly Ser Phe Gln Ser Asp Gly Ala

130 135 140

Thr Tyr Asn Val Arg Arg Cys Leu Arg Gln Asn Ala Pro Ser Ile Glu
145 150 155 160

Gly Asn Asn Ser Thr Phe Tyr Gln Tyr Phe Ser Val Arg Asn Pro Lys

165 170 175

Lys Gly Phe Gly Asn Ile Ser Gly Thr Ile Thr Val Ala Asn His Phe

180 185 190

Asn Tyr Trp Ala Ser Arg Gly Leu Asn Leu Gly Asn His Asp Tyr Met

195 200 205

Val Phe Ala Thr Glu Gly Tyr Gln Ser Gln Gly Ser Ser Asp Ile Thr

210 215 220

Val Ser Ser Gly Thr Gly Gly Gly Gly Gly Gly Gly Asn Thr Gly Ser
225 230 235 240

Lys Thr Ile Val Val Arg Ala Arg Gly Thr Ala Gly Gly Glu Asn Ile

245 250 255

Ser Leu Lys Val Asn Asn Ala Thr Ile Ala Ser Trp Thr Leu Thr Thr

260 265 270

Ser Met Ala Asn Tyr Thr Ala Thr Thr Ser Ala Ser Gly Gly Ser Leu

275 280 285

Val Glu Phe Thr Asn Asp Gly Gly Asn Arg Asp Val Gln Val Asp Tyr

290 295 300

Leu Ser Val Asn Gly Ala Val Arg Gln Ala Glu Asp Gln Thr Tyr Asn
305 310 315 320

Thr Gly Val Tyr Gln Asn Gly Gln Cys Gly Gly Gly Asn Gly Arg Ser

325 330 335

Glu Trp Leu His Cys Asn Gly Ala Ile Gly Phe Gly Asn Leu
340 345 350

<210> 177
<211> 1299 
<212> DNA 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
<400> 177 

atgaaattgt tgaaaacgca caggcgtgcg attgctgccg cagcactagc ggtggcgact 60 

gttccaatcg ctcatgcgca aacgcttagc tcaaatgcca ctggaaccca gaatggttac 120 

tactattcgt tttggaagga ttccggtaac gccaccatga cactcggtgc cggtggaaac 180 

tattcttcat cctggaacag cagcactaac aactgggttg gcggtaaagg ctggatgccg 240 

ggtactcggc gcacagtcac ctattcgggc agttatagcg cgagtggaac cagctacctc 300 

gcactttacg gctggactcg aaacccgctg atcgaatatt acattgtcga aaactgggtc 360 

aattacaatc ctgcgtccgg cgcaacgaat tatgggactg tcaatattga cggcagcacc 420 

taccagctgg gccgcagcca acgggttaat cagccatcta ttgaaggcac ggccacgttc 480 

taccaatact ggagtgtgcg ccaaaacaag cgcaccagcg gaacgattaa tattggagcg 540 

catttcgatg catgggctgc tgtgggcttg aacctgggga ctcacgatta tcagattatg 600 

gcgaccgagg gctaccagag cagcggccag tccaatatca cggtgagcga aggcagtagc 660 

ggcagcacga cttcgagcac atccagctcc agctcaagta cgagttccag tagttcttcc 720 

agcagttctt ccggcggcgg cacaggaagt tgtgccggag tgaatgtgta ccccaattgg 780 

accgcacgcg actggtctgg cggcgcatac aatcacgcca atgccggtga ccaaatggtc 840 

tatcaaaaca atttgtaccg ggcaaactgg tacaccaact ccacgcctgg aagcgatgcc 900 

tcctggacca gtctcgggtc ctgtagcggc ggcggtagca ccagttcaac aacgagctcc 960 

tccagttcct cttccacctc ggcgtcgagc agctccaact catccagcag cagttcaagc 1020 

agctccagca gcggtggctg tcgggaaatg tgtaactggt acggacaggg tatgtatcct 1080 

ctgtgtcaga acaccagcgg ttggggatgg gaaaataacc agaactgtat cggtcgccaa 1140 

acctgtcaaa gtcagaacgg cggctccggg ggtgtggtga acagctgtgg taccagcagc 1200 

tcttcgtcca gtagcacctc ctcatcgagc agttcaagtt cgtcgagtgg caccacgtca 1260 

tcgtcctccg gaattcctgc agcccggggg atccactag 1299 

<210> 178 
<211> 432 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(26) 

<400> 178 

Met Lys Leu Leu Lys Thr His Arg Arg Ala Ile Ala Ala Ala Ala Leu

1 5 10 15

Ala Val Ala Thr Val Pro Ile Ala His Ala Gln Thr Leu Ser Ser Asn

20 25 30

Ala Thr Gly Thr Gln Asn Gly Tyr Tyr Tyr Ser Phe Trp Lys Asp Ser

35 40 45

Gly Asn Ala Thr Met Thr Leu Gly Ala Gly Gly Asn Tyr Ser Ser Ser

50 55 60

Trp Asn Ser Ser Thr Asn Asn Trp Val Gly Gly Lys Gly Trp Met Pro
65 70 75 80

Gly Thr Arg Arg Thr Val Thr Tyr Ser Gly Ser Tyr Ser Ala Ser Gly

85 90 95

Thr Ser Tyr Leu Ala Leu Tyr Gly Trp Thr Arg Asn Pro Leu Ile Glu

100 105 110

Tyr Tyr Ile Val Glu Asn Trp Val Asn Tyr Asn Pro Ala Ser Gly Ala

115 120 125

Thr Asn Tyr Gly Thr Val Asn Ile Asp Gly Ser Thr Tyr Gln Leu Gly

130 135 140

Arg Ser Gln Arg Val Asn Gln Pro Ser Ile Glu Gly Thr Ala Thr Phe
145 150 155 160

Tyr Gln Tyr Trp Ser Val Arg Gln Asn Lys Arg Thr Ser Gly Thr Ile

165 170 175

Asn Ile Gly Ala His Phe Asp Ala Trp Ala Ala Val Gly Leu Asn Leu
180 185 190

<210> 179
<211> 852 
<212> DNA 
<213> Unknown 



<220> 

<223> Obtained from an environmental sample 
<400> 179 

atgaagaatt ggccgggaac gggtattata ttattattgg cgggcggcct tttggcggct 60

tgtttgacgg gcaaacggca agaggggcaa aaagtggatc cggatactca aaacgagaaa 120

ttgacaggcg ggaccgtgtt tacagctaac agcaggggga acaggcccct ggaaggttcg 180

ccttatggtt acgaaatgtg gacgcagggc gggaataata acaagcttgt ttggttcggg 240

ccggatcagg ggggaggggc ggctttcagg gcagaatgga acgagccgga tgattttttg 300

ggacgactgg gtttctggtg gggaaacggc gggcaattta aagaatataa aaatatgtac 360

gcggatttca attacacaag gtcggggcgc ggcaccggcg gcagttattc ttatataggc 420

atttacggct gggcgagaaa cccgaacgcc gcgaacgagg aagacaggtt aatagaatac 480

tatattgtgg acgactggtt cgggaatcaa tggcagtccg acgacacccc cattaccaca 540

agaacaacag gaggctccgt attgggtacc attatagcgg acggcgcgtt ttacaacgtc 600

gtcaggaatg tgagaaccca aaagccttcg atagacggca tcaaaacatt cgcccaatac 660

ttcagcatac gccaaacacc gcgccaaagc gggacaatct ccatcaccga acatttcaaa 720

caatgggaaa gcatgggcct gaagctcgga aatatgtacg aggcaaaatt cctggtagaa 780

gccggcggcg gcaccggctg gctggagttt acgtatctta aactgacgca ggaagaaaaa 840

aaaagaaatt ag 852

<210> 180 
<211> 283 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(19)

<400> 180 

Met Lys Asn Trp Pro Gly Thr Gly Ile Ile Leu Leu Leu Ala Gly Gly
1 5 10 15

Leu Leu Ala Ala Cys Leu Thr Gly Lys Arg Gln Glu Gly Gln Lys Val

20 25 30

Asp Pro Asp Thr Gln Asn Glu Lys Leu Thr Gly Gly Thr Val Phe Thr

35 40 45

Ala Asn Ser Arg Gly Asn Arg Pro Leu Glu Gly Ser Pro Tyr Gly Tyr

50 55 60

Glu Met Trp Thr Gln Gly Gly Asn Asn Asn Lys Leu Val Trp Phe Gly
65 70 75 80

Pro Asp Gln Gly Gly Gly Ala Ala Phe Arg Ala Glu Trp Asn Glu Pro

85 90 95

Asp Asp Phe Leu Gly Arg Leu Gly Phe Trp Trp Gly Asn Gly Gly Gln

100 105 110

Phe Lys Glu Tyr Lys Asn Met Tyr Ala Asp Phe Asn Tyr Thr Arg Ser

115 120 125

Gly Arg Gly Thr Gly Gly Ser Tyr Ser Tyr Ile Gly Ile Tyr Gly Trp

130 135 140

Ala Arg Asn Pro Asn Ala Ala Asn Glu Glu Asp Arg Leu Ile Glu Tyr
145 150 155 160

Tyr Ile Val Asp Asp Trp Phe Gly Asn Gln Trp Gln Ser Asp Asp Thr

165 170 175

Pro Ile Thr Thr Arg Thr Thr Gly Gly Ser Val Leu Gly Thr Ile Ile

180 185 190

Ala Asp Gly Ala Phe Tyr Asn Val Val Arg Asn Val Arg Thr Gln Lys

195 200 205

Pro Ser Ile Asp Gly Ile Lys Thr Phe Ala Gln Tyr Phe Ser Ile Arg

210 215 220

Gln Thr Pro Arg Gln Ser Gly Thr Ile Ser Ile Thr Glu His Phe Lys
225 230 235 240

Gln Trp Glu Ser Met Gly Leu Lys Leu Gly Asn Met Tyr Glu Ala Lys

245 250 255

Phe Leu Val Glu Ala Gly Gly Gly Thr Gly Trp Leu Glu Phe Thr Tyr

260 265 270

Leu Lys Leu Thr Gln Glu Glu Lys Lys Arg Asn
275 280

<210> 181 
<211> 1077 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 181 

atgaacttca gtctcaggaa ggctgcagcg gcgctggctt gcgtcgcggg cctgtatgca 60 

tcatcggcgg gcgctcagac ctgcctgacc aacaaccaga ccggcaacaa cggcgggtac 120 

tactactcgt tctggaagga cagcggcaac gtcaccttct gcctgcagtc cggcgggcga 180 

tacacgtccc agtggagcaa cgtcaacaac tgggtgggcg gcaagggctg gaacccgggt 240 

gggcgacgca ccgtcaccta ttccggcacc tacaacccca atggcaattc gtacctgacc 300 

ctgtacggct ggaccacgaa tccactggtc gagtactaca tcgtcgacag ctggggttcc 360 

tggcgcccac cgggctcggg atacatgggc acggtcacca gcgatggcgg cacctacgac 420 

atctatcgca cgcagcgtgt gaaccagcct tccatcatcg gcaccgcgac gttctaccaa 480 

tactggagcg tgcggcaatc gaagcgcgtg ggtggcacca tcacctcggg caatcacttc 540 

gatgcctggg cctcgctggg catgaacctc ggcacgcaca actacatggt gatggccacc 600 

gagggctacc agagcagcgg cagctcggac atcacggtgg gcagcggcag ttcgtcgtcg 660 

agcagcagct cgtccagcag tagcagctcg tcgtccagta gcagcagcag ttcttcgtcc 720 

agcagcagcg gtggcggcgg caccaagagc ttcaccgtgc gcgcacgcgg cacggcgggt 780 

ggcgagtcca tcaccttgcg ggtgaacaac cagaacgtgc agacctggac gctgggcacc 840 

agcatgcaga actacacggc gtccacctcg ctgagcggcg gcatcacggt ggccttcacc 900 

aacgacggcg gcaaccgcga cgtccaggtg gattacatca tcgtgaatgg ccagacgcgc 960 

cagtccgagg cgcagaccta caacaccggc ctgtatgcca atggccgctg cggtggtggc 1020 

tctaacagcg agtggatgca ctgcaacggc gccatcggct acggcaacac gccctag 1077 

<210> 182 
<211> 358 
<212> PRT 
<213> Unknown 

<220>

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(25) 

<400> 182 

Met Asn Phe Ser Leu Arg Lys Ala Ala Ala Ala Leu Ala Cys Val Ala

1 5 10 15

Gly Leu Tyr Ala Ser Ser Ala Gly Ala Gln Thr Cys Leu Thr Asn Asn

20 25 30

Gln Thr Gly Asn Asn Gly Gly Tyr Tyr Tyr Ser Phe Trp Lys Asp Ser

35 40 45

Gly Asn Val Thr Phe Cys Leu Gln Ser Gly Gly Arg Tyr Thr Ser Gln

50 55 60

Trp Ser Asn Val Asn Asn Trp Val Gly Gly Lys Gly Trp Asn Pro Gly
65 70 75 80

Gly Arg Arg Thr Val Thr Tyr Ser Gly Thr Tyr Asn Pro Asn Gly Asn

85 90 95

Ser Tyr Leu Thr Leu Tyr Gly Trp Thr Thr Asn Pro Leu Val Glu Tyr

100 105 110

Tyr Ile Val Asp Ser Trp Gly Ser Trp Arg Pro Pro Gly Ser Gly Tyr

115 120 125

Met Gly Thr Val Thr Ser Asp Gly Gly Thr Tyr Asp Ile Tyr Arg Thr

130 135 140

Gln Arg Val Asn Gln Pro Ser Ile Ile Gly Thr Ala Thr Phe Tyr Gln
145 150 155 160

Tyr Trp Ser Val Arg Gln Ser Lys Arg Val Gly Gly Thr Ile Thr Ser

165 170 175

Gly Asn His Phe Asp Ala Trp Ala Ser Leu Gly Met Asn Leu Gly Thr

180 185 190

His Asn Tyr Met Val Met Ala Thr Glu Gly Tyr Gln Ser Ser Gly Ser

195 200 205

Ser Asp Ile Thr Val Gly Ser Gly Ser Ser Ser Ser Ser Ser Ser Ser

210 215 220

Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser
225 230 235 240

Ser Ser Ser Gly Gly Gly Gly Thr Lys Ser Phe Thr Val Arg Ala Arg

245 250 255

Gly Thr Ala Gly Gly Glu Ser Ile Thr Leu Arg Val Asn Asn Gln Asn

260 265 270

Val Gln Thr Trp Thr Leu Gly Thr Ser Met Gln Asn Tyr Thr Ala Ser

275 280 285

Thr Ser Leu Ser Gly Gly Ile Thr Val Ala Phe Thr Asn Asp Gly Gly

290 295 300

Asn Arg Asp Val Gln Val Asp Tyr Ile Ile Val Asn Gly Gln Thr Arg
305 310 315 320

Gln Ser Glu Ala Gln Thr Tyr Asn Thr Gly Leu Tyr Ala Asn Gly Arg
325 330 335

Cys Gly Gly Gly Ser Asn Ser Glu Trp Met His Cys Asn Gly Ala Ile

340 345 350

Gly Tyr Gly Asn Thr Pro
355

<210> 183 

<211> 1083 

<212> DNA 

<213> Unknown

<220> 

<223> Obtained from an environmental sample 
<400> 183 

atgatcgaag gtctcaggag acctgccttc agtggcagga gcatcgtcaa ggcattgctc 60 
tgcgtcgcgg ccctgtatgc atcggcggcg caggcgcaga cctgtctcag ttcgagccag 120 
accggcacca acaacggctt ctactattcg ttctggaagg acagcccggg cagcgtgcag 180 
ttctgcatgt attccggcgg ccgctacaca tccaactgga gcggcatcaa caactgggtc 240 
ggcggcaagg ggtggcagac cggcgcctcg cgcgtggtca gctactcggg cacgttcaat 300 
tcaccgggca acggctacct ggcgctgtac ggctggacca ccaatccact ggtcgagtac 360 
tacatcgtcg acaactgggg cacctatcgc ccgccgggcg gcacgggatt ccagggcacg 420

gtgaccagtg acggcggtac ctacgacatc taccggaccg agcgcaccaa cgcgccctgc 480 

atcaccggca acaactgcaa cttctcgcag ttctggagcg tgcggcagtc gaagcgcacc 540 

ggcggcacca tcaccaccgg caatcacttc agcgcctggg cgtcgcacgg catgaacatg 600 

ggccagcaca actaccagat catggccacc gagggttacc agagcaacgg cagctcggac 660 

atcacggtct cggaaggcag cagttcgtcg agcagcagca gttcgtcctc ttcgtcgagc 720 

agcagctcgt cgagcggcgg cggcggcagc aagagcttca cggtgcgcgc ccgcggcacc 780 

gcgggtggcg agcagatccg gctgcgcgtg aacaatacga ccgtgcagac ctggacgctg 840 

aacaccacga tgacgaacta caccgcttcg accacgctga gcggcggcat cacggtggag 900 

tacttcaacg acagcaccaa tcacgacgtg caggtggact acatcatcgt gaacggcgcg 960 

acgcgccagt ccgaagcgca gagctacaac accggcctgt atgccaacgg ccgttgcggt 1020 

ggcggttcca acagcgaatg gatgcattgc aatggcgcca tcggctacgg caacactcca 1080 

taa 1083 

<210> 184 
<211> 360 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(32)

<400> 184 

Met Ile Glu Gly Leu Arg Arg Pro Ala Phe Ser Gly Arg Ser Ile Val

1 5 10 15

Lys Ala Leu Leu Cys Val Ala Ala Leu Tyr Ala Ser Ala Ala Gln Ala

20 25 30

Gln Thr Cys Leu Ser Ser Ser Gln Thr Gly Thr Asn Asn Gly Phe Tyr

35 40 45

Tyr Ser Phe Trp Lys Asp Ser Pro Gly Ser Val Gln Phe Cys Met Tyr

50 55 60

Ser Gly Gly Arg Tyr Thr Ser Asn Trp Ser Gly Ile Asn Asn Trp Val
65 70 75 80

Gly Gly Lys Gly Trp Gln Thr Gly Ala Ser Arg Val Val Ser Tyr Ser

85 90 95

Gly Thr Phe Asn Ser Pro Gly Asn Gly Tyr Leu Ala Leu Tyr Gly Trp

100 105 110

Thr Thr Asn Pro Leu Val Glu Tyr Tyr Ile Val Asp Asn Trp Gly Thr

115 120 125

Tyr Arg Pro Pro Gly Gly Thr Gly Phe Gln Gly Thr Val Thr Ser Asp

130 135 140

Gly Gly Thr Tyr Asp Ile Tyr Arg Thr Glu Arg Thr Asn Ala Pro Cys
145 150 155 160

Ile Thr Gly Asn Asn Cys Asn Phe Ser Gln Phe Trp Ser Val Arg Gln

165 170 175

Ser Lys Arg Thr Gly Gly Thr Ile Thr Thr Gly Asn His Phe Ser Ala

180 185 190

Trp Ala Ser His Gly Met Asn Met Gly Gln His Asn Tyr Gln Ile Met

195 200 205

Ala Thr Glu Gly Tyr Gln Ser Asn Gly Ser Ser Asp Ile Thr Val Ser

210 215 220

Glu Gly Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser
225 230 235 240

Ser Ser Ser Ser Ser Gly Gly Gly Gly Ser Lys Ser Phe Thr Val Arg

245 250 255

Ala Arg Gly Thr Ala Gly Gly Glu Gln Ile Arg Leu Arg Val Asn Asn

260 265 270

Thr Thr Val Gln Thr Trp Thr Leu Asn Thr Thr Met Thr Asn Tyr Thr

275 280 285

Ala Ser Thr Thr Leu Ser Gly Gly Ile Thr Val Glu Tyr Phe Asn Asp

290 295 300

Ser Thr Asn His Asp Val Gln Val Asp Tyr Ile Ile Val Asn Gly Ala
305 310 315 320

Thr Arg Gln Ser Glu Ala Gln Ser Tyr Asn Thr Gly Leu Tyr Ala Asn

325 330 335

Gly Arg Cys Gly Gly Gly Ser Asn Ser Glu Trp Met His Cys Asn Gly

340 345 350 

Ala Ile Gly Tyr Gly Asn Thr Pro
355 360 

<210> 185 
<211> 684 
<212> DNA 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
<400> 185 

atgaatttga aaagattgag gctgttgttt gtgatgtgta ttggatttgt gctgacactg 60 

acggctgtgc cagctcatgc ggaaacgatt tatgataata ggatagggac acacagcgga 120 

tacgattttg aattatggaa ggattacgga aatacctcga tgacactcaa taacggcggg 180 

gcatttagtg caagctggaa caatattgga aatgccttat ttcgaaaagg aaagaagttt 240 

gattccacta aaactcatca tcaacttggc aacatctcca tcaactacaa cgcagccttt 300 

aacccgggcg ggaattccta tttatgtgtc tatggctgga cacaatctcc attagctgaa 360 

tactacattg ttgagtcatg gggcacatat cgtccaacag gaacgtataa aggatcattt 420 

tatgccgatg gaggcacata tgacatatat gaaacgctcc gtgtcaatca gccttctatc 480 

attggagacg ctaccttcaa acaatattgg agtgtacgtc aaacaaaacg cacaagcgga 540 

actgtttccg tcagtgagca ttttaaaaaa tgggaaagct taggcatgcc aatgggaaaa 600 

atgtatgaaa cagcattaac tgtagaaggc taccgaagca acggaagtgc gaatgtcatg 660 

acgaatcagc tgatgattcg ataa 684 

<210> 186 

<211> 227 

<212> PRT 

<213> Unknown

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(27) 

<400> 186 

Met Asn Leu Lys Arg Leu Arg Leu Leu Phe Val Met Cys Ile Gly Phe

1 5 10 15

Val Leu Thr Leu Thr Ala Val Pro Ala His Ala Glu Thr Ile Tyr Asp

20 25 30

Asn Arg Ile Gly Thr His Ser Gly Tyr Asp Phe Glu Leu Trp Lys Asp

35 40 45

Tyr Gly Asn Thr Ser Met Thr Leu Asn Asn Gly Gly Ala Phe Ser Ala

50 55 60

Ser Trp Asn Asn Ile Gly Asn Ala Leu Phe Arg Lys Gly Lys Lys Phe
65 70 75 80

Asp Ser Thr Lys Thr His His Gln Leu Gly Asn Ile Ser Ile Asn Tyr

85 90 95

Asn Ala Ala Phe Asn Pro Gly Gly Asn Ser Tyr Leu Cys Val Tyr Gly

100 105 110

Trp Thr Gln Ser Pro Leu Ala Glu Tyr Tyr Ile Val Glu Ser Trp Gly

115 120 125

Thr Tyr Arg Pro Thr Gly Thr Tyr Lys Gly Ser Phe Tyr Ala Asp Gly

130 135 140

Gly Thr Tyr Asp Ile Tyr Glu Thr Leu Arg Val Asn Gln Pro Ser Ile
145 150 155 160

Ile Gly Asp Ala Thr Phe Lys Gln Tyr Trp Ser Val Arg Gln Thr Lys

165 170 175

Arg Thr Ser Gly Thr Val Ser Val Ser Glu His Phe Lys Lys Trp Glu

180 185 190

Ser Leu Gly Met Pro Met Gly Lys Met Tyr Glu Thr Ala Leu Thr Val

195 200 205

Glu Gly Tyr Arg Ser Asn Gly Ser Ala Asn Val Met Thr Asn Gln Leu

210 215 220

Met Ile Arg
225

<210> 187
<211> 642 
<212> DNA 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
<400> 187 

atgtttaagt ttaaaaagaa tttcttagtt ggattatcgg cagctttaat gagtattagc 60 

ttgttttcgg caaccgcctc tgcagctagc acagactact ggcaaaattg gactgatggg 120 

ggcggtatag taaacgctgt caatgggtct ggcgggaatt acagtgttaa ttggtctaat 180 

accggaaatt ttgttgttgg taaaggttgg actacaggtt cgccatttag gacgataaac 240 

tataatgccg gagtttgggc gccgaatggc aatggatatt taactttata tggttggacg 300 

agatcacctc tcatagaata ttatgtagtg gattcatggg gtacttatag acctactgga 360 

acgtataaag gtactgtaaa aagtgatggg ggtacatatg acatatatac aactacacgt 420 

tataacgcac cttccattga tggcgatcgc actactttta cgcagtactg gagtgttcgc 480 

cagtcgaaga gaccaaccgg aagcaacgct acaatcactt tcagcaatca tgtgaacgca 540 

tggaagagcc atggaatgaa tctgggcagt aattgggctt accaagtcat ggcgacagaa 600 

ggatatcaaa gtagtggaag ttctaacgta acagtgtggt aa 642 

<210> 188 
<211> 213 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(28) 

<400> 188 

Met Phe Lys Phe Lys Lys Asn Phe Leu Val Gly Leu Ser Ala Ala Leu 

1 5 10 15

Met Ser Ile Ser Leu Phe Ser Ala Thr Ala Ser Ala Ala Ser Thr Asp

20 25 30

Tyr Trp Gln Asn Trp Thr Asp Gly Gly Gly Ile Val Asn Ala Val Asn

35 40 45

Gly Ser Gly Gly Asn Tyr Ser Val Asn Trp Ser Asn Thr Gly Asn Phe

50 55 60

Val Val Gly Lys Gly Trp Thr Thr Gly Ser Pro Phe Arg Thr Ile Asn
65 70 75 80

Tyr Asn Ala Gly Val Trp Ala Pro Asn Gly Asn Gly Tyr Leu Thr Leu

85 90 95

Tyr Gly Trp Thr Arg Ser Pro Leu Ile Glu Tyr Tyr Val Val Asp Ser

100 105 110

Trp Gly Thr Tyr Arg Pro Thr Gly Thr Tyr Lys Gly Thr Val Lys Ser

115 120 125

Asp Gly Gly Thr Tyr Asp Ile Tyr Thr Thr Thr Arg Tyr Asn Ala Pro

130 135 140

Ser Ile Asp Gly Asp Arg Thr Thr Phe Thr Gln Tyr Trp Ser Val Arg
145 150 155 160

Gln Ser Lys Arg Pro Thr Gly Ser Asn Ala Thr Ile Thr Phe Ser Asn

165 170 175

His Val Asn Ala Trp Lys Ser His Gly Met Asn Leu Gly Ser Asn Trp

180 185 190

Ala Tyr Gln Val Met Ala Thr Glu Gly Tyr Gln Ser Ser Gly Ser Ser

195 200 205

Asn Val Thr Val Trp
210

<210> 189 
<211> 570 
<212> DNA
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
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<400> 189 

atggccctta tggcttcgac agactactgg caaaattgga ctgatggtgg tgggacagta 

aatgctacca atggatctga tggcaattac agcgtttcat ggtcaaattg cgggaatttt 

gttgttggta aaggctggac taccggatca gcaactaggg taataaacta taatgccgga 

gccttttcgc cgtccggtaa tggatatttg gctctttatg ggtggacgag aaattcactc 

atagaatatt acgtcgttga tagctggggg acttatagac ctactggaac ttataaaggc 

actgtgacta gtgatggagg gacttatgac atatacacga ctacacgaac caacgcacct 

tccattgacg gcaataatac aactttcacc cagttctgga gtgttaggca gtcgaagaga 

ccgattggta ccaacaatac catcaccttt agcaaccatg ttaacgcctg gaagagtaaa 

ggaatgaatt tggggagtag ttggtcttat caggtattag caacagaggg ctatcaaagt 
agtgggtact ctaacgtaac ggtctggtaa 

<210> 190 
<211> 189 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 190 

Met Ala Leu Met Ala Ser Thr Asp Tyr Trp Gln Asn Trp Thr Asp Gly
1 5 10 15

Gly Gly Thr Val Asn Ala Thr Asn Gly Ser Asp Gly Asn Tyr Ser Val
20 25 30

Ser Trp Ser Asn Cys Gly Asn Phe Val Val Gly Lys Gly Trp Thr Thr
35 40 45

Gly Ser Ala Thr Arg Val Ile Asn Tyr Asn Ala Gly Ala Phe Ser Pro
50 55 60

Ser Gly Asn Gly Tyr Leu Ala Leu Tyr Gly Trp Thr Arg Asn Ser Leu
65 70 75 80

Ile Glu Tyr Tyr Val Val Asp Ser Trp Gly Thr Tyr Arg Pro Thr Gly
85 90 95

Thr Tyr Lys Gly Thr Val Thr Ser Asp Gly Gly Thr Tyr Asp Ile Tyr
100 105 110

Thr Thr Thr Arg Thr Asn Ala Pro Ser Ile Asp Gly Asn Asn Thr Thr
115 120 125

Phe Thr Gln Phe Trp Ser Val Arg Gln Ser Lys Arg Pro Ile Gly Thr
130 135 140

Asn Asn Thr Ile Thr Phe Ser Asn His Val Asn Ala Trp Lys Ser Lys
145 150 155 160

Gly Met Asn Leu Gly Ser Ser Trp Ser Tyr Gln Val Leu Ala Thr Glu
165 170 175

Gly Tyr Gln Ser Ser Gly Tyr Ser Asn Val Thr Val Trp
180 185

<210> 191 
<211> 1053 
<212> DNA 
<213> Unknown

<220>

<223> Obtained from an environmental sample

<400> 191

atgaagtcca ttcgcagccg cagcctcgcc accgccgtcc tggctggcgc cctcggcgtc 60

gcagccgccg gcgcgcaggc gcagacgctc aacaacaatt ccaccggcac gcacgacggc 120

ttctactaca cgttctggaa ggactcgggc agcgcctcga tgaccctcca tccgggcgga 180

cgctacagct cccagtggac cagcaacacc aacaactggg tcggcgggaa aggctggaat 240

cccggtggcc cgcgcgtggt caactactcg ggctactacg gggtcaacaa cagccagaac 300

tcctacctgg cgctgtacgg ctggacccgc aatccgctgg tcgagtacta cgtgatcgag 360

agctacggct cctacaaccc ggccagttgc gccggcgggg tggactacgg cagcttccag 420

agcgatggcg ccacctacaa cgtacgccgc tgcctgcgcc agaacgcgcc gtcgatcgaa 480

ggcaacaaca gcaccttcta ccagtacttc agcgtgcgca atcccaagaa gggattcggc 540

aacatctccg gcacgatcac cgtcgccaac cacttcaact actgggccag ccgcggcctc 600

aacctcggca accacgacta catggtgttc gccaccgagg gctaccagag ccagggcagc 660

agcgacatca ccgtgagttc gggtaccggc ggcggcggtg gcggcggcaa cacgggcagc 720

aagaccatcg tggtgcgcgc gcgcggcacc gccggcggag agaacatctc gctcaaggtc 780

aacaacgcca ccatcgccag ctggacgctc accaccagca tggccaacta cacggccacc 840

acctcggcat cgggcggctc gctggtggag ttcaccaacg acggcggcaa ccgcgacgtg 900

caggtggact acctcagcgt caatggcgcc gtccgccagg ccgaggacca gacctacaac 960

accggcgtgt accagaacgg ccagtgcggc ggcggcaacg gccgcagcga atggctgcac 1020

tgcaacggtg ccatcggctt cggaaatctc tga 1053

<210> 192 
<211> 350 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(27)

<400> 192

Met Lys Ser Ile Arg Ser Arg Ser Leu Ala Thr Ala Val Leu Ala Gly
1 5 10 15

Ala Leu Gly Val Ala Ala Ala Gly Ala Gln Ala Gln Thr Leu Asn Asn
20 25 30

Asn Ser Thr Gly Thr His Asp Gly Phe Tyr Tyr Thr Phe Trp Lys Asp
35 40 45

Ser Gly Ser Ala Ser Met Thr Leu His Pro Gly Gly Arg Tyr Ser Ser
50 55 60

Gln Trp Thr Ser Asn Thr Asn Asn Trp Val Gly Gly Lys Gly Trp Asn
65 70 75 80

Pro Gly Gly Pro Arg Val Val Asn Tyr Ser Gly Tyr Tyr Gly Val Asn
85 90 95

Asn Ser Gln Asn Ser Tyr Leu Ala Leu Tyr Gly Trp Thr Arg Asn Pro
100 105 110

Leu Val Glu Tyr Tyr Val Ile Glu Ser Tyr Gly Ser Tyr Asn Pro Ala
115 120 125

Ser Cys Ala Gly Gly Val Asp Tyr Gly Ser Phe Gln Ser Asp Gly Ala
130 135 140

Thr Tyr Asn Val Arg Arg Cys Leu Arg Gln Asn Ala Pro Ser Ile Glu
145 150 155 160

Gly Asn Asn Ser Thr Phe Tyr Gln Tyr Phe Ser Val Arg Asn Pro Lys
165 170 175

Lys Gly Phe Gly Asn Ile Ser Gly Thr Ile Thr Val Ala Asn His Phe
180 185 190

Asn Tyr Trp Ala Ser Arg Gly Leu Asn Leu Gly Asn His Asp Tyr Met
195 200 205

Val Phe Ala Thr Glu Gly Tyr Gln Ser Gln Gly Ser Ser Asp Ile Thr
210 215 220

Val Ser Ser Gly Thr Gly Gly Gly Gly Gly Gly Gly Asn Thr Gly Ser
225 230 235 240

Lys Thr Ile Val Val Arg Ala Arg Gly Thr Ala Gly Gly Glu Asn Ile
245 250 255

Ser Leu Lys Val Asn Asn Ala Thr Ile Ala Ser Trp Thr Leu Thr Thr
260 265 270

Ser Met Ala Asn Tyr Thr Ala Thr Thr Ser Ala Ser Gly Gly Ser Leu
275 280 285

Val Glu Phe Thr Asn Asp Gly Gly Asn Arg Asp Val Gln Val Asp Tyr
290 295 300

Leu Ser Val Asn Gly Ala Val Arg Gln Ala Glu Asp Gln Thr Tyr Asn
305 310 315 320

Thr Gly Val Tyr Gln Asn Gly Gln Cys Gly Gly Gly Asn Gly Arg Ser
325 330 335

Glu Trp Leu His Cys Asn Gly Ala Ile Gly Phe Gly Asn Leu
340 345 350

<210> 193 
<211> 840 
<212> DNA 
<213> Unknown 

<220> 
<223> Obtained from an environmental sample
<400> 193 

atgacgaagt atcggttagg aataggtatt ttcattttgt tggtttgttg cttttcggcg 60 

gcatgtattg tgcctaaaca acaagaggaa caaaaagtgg ctcctacaga attgaccggc 120 

gcgataacat tcacagccaa cagcaacgga aacaagcccc tgaacggctc gccctacggt 180 

tacgaaatat ggacacaggg cgggaccaat aacaaactga tctggttcgg gccggatcag 240 

ggcggcggcg cggctttcag agccgaatgg aacaacccta acgatttttt aggccgcgtg 300 

ggtttttact ggggtaatgg cggaaaatat accgagtaca aaaatatgta tgcggatttt 360 

agctacacta gatctggacg caacaccgcc ggtaattatt catatatagg gatttatggc 420 

tgggctagaa atccaaatgc cgcaaaagaa gaagacaaat tgatagagta ttatattgtg 480 

gaagattggt ttggcaatca atggcaagag gatagctcac ccattaccac taatacaaca 540 

agtggaaccg tattgggaag ttttactata gatggcgcgg tttataatgt cgttagaaat 600 

gtcagagtcc aacaaccttc gatagacgga accaaaacat tcacccaata cttcagcata 660 

cgacaaacgc cccgacagag cgggacaatt tccattaccg ggcatttcag gcaatgggag 720 

agcatgggtt tacagcttgg caatatgtac gaggcaaagt ttcttgttga agccggcggc 780 

ggcacaggat ggctggaatt ttcatacctt aaattaacga tggaagacag cttaaggtaa 840 

<210> 194 
<211> 279 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(21) 

<400> 194 

Met Thr Lys Tyr Arg Leu Gly Ile Gly Ile Phe Ile Leu Leu Val Cys
1 5 10 15

Cys Phe Ser Ala Ala Cys Ile Val Pro Lys Gln Gln Glu Glu Gln Lys
20 25 30

Val Ala Pro Thr Glu Leu Thr Gly Ala Ile Thr Phe Thr Ala Asn Ser
35 40 45

Asn Gly Asn Lys Pro Leu Asn Gly Ser Pro Tyr Gly Tyr Glu Ile Trp
50 55 60

Thr Gln Gly Gly Thr Asn Asn Lys Leu Ile Trp Phe Gly Pro Asp Gln
65 70 75 80

Gly Gly Gly Ala Ala Phe Arg Ala Glu Trp Asn Asn Pro Asn Asp Phe
85 90 95

Leu Gly Arg Val Gly Phe Tyr Trp Gly Asn Gly Gly Lys Tyr Thr Glu
100 105 110

Tyr Lys Asn Met Tyr Ala Asp Phe Ser Tyr Thr Arg Ser Gly Arg Asn
115 120 125

Thr Ala Gly Asn Tyr Ser Tyr Ile Gly Ile Tyr Gly Trp Ala Arg Asn
130 135 140

Pro Asn Ala Ala Lys Glu Glu Asp Lys Leu Ile Glu Tyr Tyr Ile Val
145 150 155 160

Glu Asp Trp Phe Gly Asn Gln Trp Gln Glu Asp Ser Ser Pro Ile Thr
165 170 175

Thr Asn Thr Thr Ser Gly Thr Val Leu Gly Ser Phe Thr Ile Asp Gly
180 185 190

Ala Val Tyr Asn Val Val Arg Asn Val Arg Val Gln Gln Pro Ser Ile
195 200 205

Asp Gly Thr Lys Thr Phe Thr Gln Tyr Phe Ser Ile Arg Gln Thr Pro
210 215 220

Arg Gln Ser Gly Thr Ile Ser Ile Thr Gly His Phe Arg Gln Trp Glu
225 230 235 240

Ser Met Gly Leu Gln Leu Gly Asn Met Tyr Glu Ala Lys Phe Leu Val
245 250 255

Glu Ala Gly Gly Gly Thr Gly Trp Leu Glu Phe Ser Tyr Leu Lys Leu
260 265 270

Thr Met Glu Asp Ser Leu Arg
275

<210> 195 
<211> 1044 
<212> DNA

<213> Unknown 
<220> 

<223> Obtained from an environmental sample 
<400> 195 

atgttcaatc tgaagagagt ggcggcgctc ctgtgcgtcg cagggctggg ggtgtctgcg 60 

gcaaatgcgc agacctgtct caattcgagt gggaccggca ccaacaacgg cttctattat 120 

tccttctgga aagacagtcc gggttcagtg aatttctgca tgtactccgg cggtcgctac 180 

acgtcgagct ggagcggcat caacaactgg gtcggcggca agggctggca aaccggatcg 240 

cgccggacca tcaactactc cggcagcttc aactcgccgg gcaatggcta cctcgcgctc 300 

tacggatgga ccaccaatcc actcgtcgag tactacatcg tcgacaactg gggcacgtat 360 

cgtccgcccg gcggccaggg ctacatgggc acggtcacga gcgacggcgc cacgtacgac 420 

gtctatcgaa cgcaacgagt cgatgcgccg tcgatcattg gtgatcacca gaccttctat 480 

caatactgga gcgtgcgtca gtcgaagagg accggcggaa ccatcaccac cggcaaccac 540 

ttcgatggct gggcgagcta cggcatgaac ctgggaactc acaactacca gatcctggcg 600 

accgagggtt atcaaagcag cggcagctcg gacctcaccg tgagcgaagg cagcagcagt 660 

agcagcagcg gtggcgggag cagttcgagc agcagcggcg gcggtggcac caagagcttc 720 

acggtccgcg cgcgcggcac ggccggtgga gagtcgatca cgttgcgcgt gaataaccag 780 

aacgtgcaga cctggacgct cggcacgagc atgacgaact acacggcgtc gacgtcgctg 840 

agcggcggca tcaccgtggc gttcacgaac gacggtggca accgcgatgt tcaggtggac 900 

tacatcatcg tgaacggcca gacacgccag tcggaagcgc agagctacaa caccgggctc 960 

tacgcgaatg gacgttgcgg cggtggctcg aacagcgagt ggatgcactg caacggcgcg 1020 

attggctacg gaaacacgcc gtaa 1044 

<210> 196 
<211> 347 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(23)

<400> 196

Met Phe Asn Leu Lys Arg Val Ala Ala Leu Leu Cys Val Ala Gly Leu
1 5 10 15

Gly Val Ser Ala Ala Asn Ala Gln Thr Cys Leu Asn Ser Ser Gly Thr
20 25 30

Gly Thr Asn Asn Gly Phe Tyr Tyr Ser Phe Trp Lys Asp Ser Pro Gly
35 40 45

Ser Val Asn Phe Cys Met Tyr Ser Gly Gly Arg Tyr Thr Ser Ser Trp
50 55 60

Ser Gly Ile Asn Asn Trp Val Gly Gly Lys Gly Trp Gln Thr Gly Ser
65 70 75 80

Arg Arg Thr Ile Asn Tyr Ser Gly Ser Phe Asn Ser Pro Gly Asn Gly
85 90 95

Tyr Leu Ala Leu Tyr Gly Trp Thr Thr Asn Pro Leu Val Glu Tyr Tyr
100 105 110

Ile Val Asp Asn Trp Gly Thr Tyr Arg Pro Pro Gly Gly Gln Gly Tyr
115 120 125

Met Gly Thr Val Thr Ser Asp Gly Ala Thr Tyr Asp Val Tyr Arg Thr
130 135 140

Gln Arg Val Asp Ala Pro Ser Ile Ile Gly Asp His Gln Thr Phe Tyr
145 150 155 160

Gln Tyr Trp Ser Val Arg Gln Ser Lys Arg Thr Gly Gly Thr Ile Thr
165 170 175

Thr Gly Asn His Phe Asp Gly Trp Ala Ser Tyr Gly Met Asn Leu Gly
180 185 190

Thr His Asn Tyr Gln Ile Leu Ala Thr Glu Gly Tyr Gln Ser Ser Gly
195 200 205

Ser Ser Asp Leu Thr Val Ser Glu Gly Ser Ser Ser Ser Ser Ser Gly
210 215 220

Gly Gly Ser Ser Ser Ser Ser Ser Gly Gly Gly Gly Thr Lys Ser Phe
225 230 235 240

Thr Val Arg Ala Arg Gly Thr Ala Gly Gly Glu Ser Ile Thr Leu Arg
245 250 255

Val Asn Asn Gln Asn Val Gln Thr Trp Thr Leu Gly Thr Ser Met Thr
260 265 270

Asn Tyr Thr Ala Ser Thr Ser Leu Ser Gly Gly Ile Thr Val Ala Phe
275 280 285

Thr Asn Asp Gly Gly Asn Arg Asp Val Gln Val Asp Tyr Ile Ile Val
290 295 300

Asn Gly Gln Thr Arg Gln Ser Glu Ala Gln Ser Tyr Asn Thr Gly Leu
305 310 315 320

Tyr Ala Asn Gly Arg Cys Gly Gly Gly Ser Asn Ser Glu Trp Met His
325 330 335

Cys Asn Gly Ala Ile Gly Tyr Gly Asn Thr Pro
340 345

<210> 197 
<211> 636 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 197 

atgtttaagt tcagtaagaa aatgatgacg gttattcttg cagctaccat gagttttggt 60 

ttatttgcaa caacctcaag tgcagcaacc gactattggc aaaattggac cgatggcggc 120 

ggaacggtta atgctgtaaa cggctccggc ggtaattaca gcgtgacatg gcaaaatacc 180 

ggaaattttg tcgtcggcaa aggctggaat accggatcgc ctaaccgaac cattaactac 240 

aatgccggcg tctgggcgcc ttccggcaat gggtatttga ctctctacgg atggacgaga 300 

aacgcactca ttgaatatta cgtcgtggat agctggggta cttatcggcc tacaggaaca 360 

tataaaggga cggtgacaag tgatgggggc acatatgata tctatacgac catgcggcac 420 

aacgcgcctt ccattgacgg aactcaaacg tttgcccagt actggagtgt tcgacaatcg 480 

aaaagagcga ccggggtcaa ctcctccatt acgttcagca accacgtgaa cgcatgggct 540 

agcaagggaa tgaatctggg aagcagctgg tcatatcagg tgttagctac agagggttat 600 

caaagtagcg gaagctctaa cgtaacagtg tggtaa 636

<210> 198 

<211> 211 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(28)

<400> 198

Met Phe Lys Phe Ser Lys Lys Met Met Thr Val Ile Leu Ala Ala Thr
1 5 10 15

Met Ser Phe Gly Leu Phe Ala Thr Thr Ser Ser Ala Ala Thr Asp Tyr
20 25 30

Trp Gln Asn Trp Thr Asp Gly Gly Gly Thr Val Asn Ala Val Asn Gly
35 40 45

Ser Gly Gly Asn Tyr Ser Val Thr Trp Gln Asn Thr Gly Asn Phe Val
50 55 60

Val Gly Lys Gly Trp Asn Thr Gly Ser Pro Asn Arg Thr Ile Asn Tyr
65 70 75 80

Asn Ala Gly Val Trp Ala Pro Ser Gly Asn Gly Tyr Leu Thr Leu Tyr
85 90 95

Gly Trp Thr Arg Asn Ala Leu Ile Glu Tyr Tyr Val Val Asp Ser Trp
100 105 110

Gly Thr Tyr Arg Pro Thr Gly Thr Tyr Lys Gly Thr Val Thr Ser Asp
115 120 125

Gly Gly Thr Tyr Asp Ile Tyr Thr Thr Met Arg His Asn Ala Pro Ser
130 135 140

Ile Asp Gly Thr Gln Thr Phe Ala Gln Tyr Trp Ser Val Arg Gln Ser
145 150 155 160

Lys Arg Ala Thr Gly Val Asn Ser Ser Ile Thr Phe Ser Asn His Val
165 170 175

Asn Ala Trp Ala Ser Lys Gly Met Asn Leu Gly Ser Ser Trp Ser Tyr
180 185 190

Gln Val Leu Ala Thr Glu Gly Tyr Gln Ser Ser Gly Ser Ser Asn Val
195 200 205

Thr Val Trp
210

<210> 199 
<211> 1074 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 199 

atgattttcg gtctaaagtc gatcacgggc aggcgcgccg tcgcggcgct ggcctgcctt 60 

gccggcctct acatggcgcc ggcgaatgcg caaacctgca tcacgtcgag ccagacgggc 120 

accaacaacg gcaactactt ttcgttctgg aaagacagcc cgggcacggt gaacttctgc 180 

atgtactccg gcggccgcta cacgtccaac tggagcggca tcaacaactg ggtgggcggc 240 

aagggctggc agacgggctc gtcccgcacc gtctcctact ccggcagctt caattcgccg 300 

ggtaacggct acctgacgct ctacggctgg accaccaatc cgctcatcga gtactacatc 360 

gtcgacaact ggggcagcta tcgtccgccg ggtggccagg gcttcatggg cacggtgaac 420 

accgacggcg gcacgtacga catctatcgc acgcaacggg tcaaccagcc gtcgatcatc 480 

ggcaccgcga cgttctacca gtactggagc gtgcggcagt cgaagcgcac cggcggcacc 540 

atcaccacgg ccaaccactt caatgcctgg gccagcctcg gcatgaacct gggacagcac 600 

aactaccagg tgatggccac cgagggctac cagagcagcg gcagctccga catcacggtg 660 

tgggaaggca cgagcagcgg cggaagcagc aatggcggca gcagcaacgg cggcagcagc 720 

aatggtggca gcggcggcac gaagagcttc acggtgcgcg cgcgcggcac tgcgggcggc 780 

gagtccatca cgctgcgggt caacaaccag aacgtgcaga cctggacgct gggtaccagc 840 

atgcagaact acacggcctc gacctcgctg agcggcggca tcacggtggc gttcaccaac 900 

gacggcggca gccgcgacgt gcaggtggac tacatcatcg tgaatggcca gacccgccag 960 

tccgaacagc agagctacaa cactggcctc tacgccaatg gaagctgtgg tggcggttcg 1020 

aacagcgagt ggatgcattg caacggcgcc atcggctacg gcaatacgcc ctga 1074 

<210> 200 

<211> 354 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(30)

<400> 200

Met Ile Phe Gly Leu Lys Ser Ile Thr Gly Arg Arg Ala Val Ala Ala
1 5 10 15

Leu Ala Cys Leu Ala Gly Leu Tyr Met Ala Pro Ala Asn Ala Gln Thr
20 25 30

Cys Ile Thr Ser Ser Gln Thr Gly Thr Asn Asn Gly Asn Tyr Phe Ser
35 40 45

Phe Trp Lys Asp Ser Pro Gly Thr Val Asn Phe Cys Met Tyr Ser Gly
50 55 60

Gly Arg Tyr Thr Ser Asn Trp Ser Gly Ile Asn Asn Trp Val Gly Gly
65 70 75 80

Lys Gly Trp Gln Thr Gly Ser Ser Arg Thr Val Ser Tyr Ser Gly Ser
85 90 95

Phe Asn Ser Pro Gly Asn Gly Tyr Leu Thr Leu Tyr Gly Trp Thr Thr
100 105 110

Asn Pro Leu Ile Glu Tyr Tyr Ile Val Asp Asn Trp Gly Ser Tyr Arg
115 120 125

Pro Pro Gly Gly Gln Gly Phe Met Gly Thr Val Asn Thr Asp Gly Gly
130 135 140

Thr Tyr Asp Ile Tyr Arg Thr Gln Arg Val Asn Gln Pro Ser Ile Ile
145 150 155 160

Gly Thr Ala Thr Phe Tyr Gln Tyr Trp Ser Val Arg Gln Ser Lys Arg
165 170 175

Thr Gly Gly Thr Ile Thr Thr Ala Asn His Phe Asn Ala Trp Ala Ser
180 185 190

Leu Gly Met Asn Leu Gly Gln His Asn Tyr Gln Val Met Ala Thr Glu
195 200 205

Gly Tyr Gln Ser Ser Gly Ser Ser Asp Ile Thr Val Trp Glu Gly Thr
210 215 220

Ser Ser Gly Gly Ser Ser Asn Gly Gly Ser Ser Asn Gly Gly Ser Ser
225 230 235 240

Asn Gly Gly Ser Gly Gly Thr Lys Ser Phe Thr Val Arg Ala Arg Gly
245 250 255

Thr Ala Gly Gly Glu Ser Ile Thr Leu Arg Val Asn Asn Gln Asn Val
260 265 270

Gln Thr Trp Thr Leu Gly Thr Ser Met Gln Asn Tyr Thr Ala Ser Thr
275 280 285

Ser Leu Ser Gly Gly Ile Thr Val Ala Phe Thr Asn Asp Gly Gly Ser
290 295 300

Arg Asp Val Gln Val Asp Tyr Ile Ile Val Asn Gly Gln Thr Arg Gln
305 310 315 320

Ser Glu Gln Gln Ser Tyr Asn Thr Gly Leu Tyr Ala Asn Gly Ser Cys
325 330 335

Gly Gly Gly Ser Asn Ser Glu Trp Met His Cys Asn Gly Ala Ile Gly
340 345 350

Tyr Gly

<210> 201 
<211> 1002 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample
<400> 201 

atgaagatga acagctccct cccctccctc cgcgatgtat tcgcgaatga tttccgcatc 60 

ggggcggcgg tcaatcctgt gacgatcgag atgcaaaaac agttgttgat cgatcatgtc 120 

aacagtatta cggcagagaa ccatatgaag tttgagcatc ttcagccgga agaagggaaa 180 

tttacctttc aggaagcgga tcggattgtg gattttgctt gttcgcaccg aatggcggtt 240 

cgagggcaca cacttgtatg gcacaaccag actccggatt gggtgtttca agatggtcaa 300 

ggccatttcg tcagtcggga tgtgttgctt gagcggatga aatgtcacat ttcaactgtt 360 

gtacggcgat acaagggaaa aatatattgt tgggatgtca tcaacgaagc ggtagccgac 420 

gaaggagacg aattgttgag gccgtcgaag tggcgacaaa tcatcgggga cgattttatg 480 

gaacaagcat ttctctacgc ttatgaagct gacccagatg cactgctttt ttacaatgac 540 

tataatgaat gttttccgga aaagagagaa aaaatttttg cacttgtcaa atcgctgcgt 600 

gataaaggca ttccgattca tggcatcggc atgcaggcgc actggagcct gacccgcccg 660 

tcgcttgatg aaattcgtgc ggcgattgaa cggtatgcgt cccttggtgt tgttcttcat 720 

attacggaac tcgatgtatc catgtttgaa tttcacgatc gtcgaaccga tttggctgtc 780 

ccgacgaacg aaatgatcga acagcaagca gaacggtatg ggcaaatttt tgctttgttt 840 

aaggagtatc gcgatgttat tcaaagtgtc acattttggg gaattgctga tgaccataca 900 

tggctcgata actttccagt gcacgggaga aaaaactggc cgcttttgtt cgatgaacag 960 

cataaaccga aaccagcttt ttggcgggca gtgagtgtct ga 1002

<210> 202 
<211> 333 
<212> PRT 
<213> Unknown

<220>

<223> Obtained from an environmental sample

<400> 202

Met Lys Met Asn Ser Ser Leu Pro Ser Leu Arg Asp Val Phe Ala Asn
1 5 10 15

Asp Phe Arg Ile Gly Ala Ala Val Asn Pro Val Thr Ile Glu Met Gln
20 25 30

Lys Gln Leu Leu Ile Asp His Val Asn Ser Ile Thr Ala Glu Asn His
35 40 45

Met Lys Phe Glu His Leu Gln Pro Glu Glu Gly Lys Phe Thr Phe Gln
50 55 60

Glu Ala Asp Arg Ile Val Asp Phe Ala Cys Ser His Arg Met Ala Val
65 70 75 80

Arg Gly His Thr Leu Val Trp His Asn Gln Thr Pro Asp Trp Val Phe
85 90 95

Gln Asp Gly Gln Gly His Phe Val Ser Arg Asp Val Leu Leu Glu Arg
100 105 110

Met Lys Cys His Ile Ser Thr Val Val Arg Arg Tyr Lys Gly Lys Ile
115 120 125

Tyr Cys Trp Asp Val Ile Asn Glu Ala Val Ala Asp Glu Gly Asp Glu
130 135 140

Leu Leu Arg Pro Ser Lys Trp Arg Gln Ile Ile Gly Asp Asp Phe Met
145 150 155 160

Glu Gln Ala Phe Leu Tyr Ala Tyr Glu Ala Asp Pro Asp Ala Leu Leu
165 170 175

Phe Tyr Asn Asp Tyr Asn Glu Cys Phe Pro Glu Lys Arg Glu Lys Ile
180 185 190

Phe Ala Leu Val Lys Ser Leu Arg Asp Lys Gly Ile Pro Ile His Gly
195 200 205

Ile Gly Met Gln Ala His Trp Ser Leu Thr Arg Pro Ser Leu Asp Glu
210 215 220

Ile Arg Ala Ala Ile Glu Arg Tyr Ala Ser Leu Gly Val Val Leu His
225 230 235 240

Ile Thr Glu Leu Asp Val Ser Met Phe Glu Phe His Asp Arg Arg Thr
245 250 255

Asp Leu Ala Val Pro Thr Asn Glu Met Ile Glu Gln Gln Ala Glu Arg
260 265 270

Tyr Gly Gln Ile Phe Ala Leu Phe Lys Glu Tyr Arg Asp Val Ile Gln
275 280 285

Ser Val Thr Phe Trp Gly Ile Ala Asp Asp His Thr Trp Leu Asp Asn
290 295 300

Phe Pro Val His Gly Arg Lys Asn Trp Pro Leu Leu Phe Asp Glu Gln
305 310 315 320

His Lys Pro Lys Pro Ala Phe Trp Arg Ala Val Ser Val
325 330

<210> 203 
<211> 687 
<212> DNA 
<213> Unknown

<220> 

<223> Obtained from an environmental sample 
<400> 203 

atgaaatctg cacgcgcact tttggtggcg ctatcacgca tacttccgat cgcacttgtg 60 

ctgttgctcg cccccgtccc cgcgcaagcc caacaggtct gcaacaacgg aacgggcacg 120 

cataacggct tcttctggac gttttggaag gacggcggca cggcctgcat gacgctcggc 180 

tcgggcggca attatagcac gacgttcaat ctgtccggcg gccgcaacct tgttgcgggc 240 

aagggctggc agactggctc caccaaccga gtcgtcggtt acaatgcggg cgtctggaac 300 

ccaggcacca attcttatct gacgctctat ggctggtcga cgaatccgct cgtcgaatat 360 

tatgtcgtgg accattgggg cagccaattc accccgccag gcaacggcgc gcagagcatg 420 

gggaccgtga ccaccgacgg cggcacctac aacatctacc gcacccaacg cgtcaacgcg 480 

ccttcgatca tcggcaacgc cacgttctac caatattgga gcgtgcgcac ttcgcgccgc 540 

gggcaaggca cgaacaacac gatcaccttc gccaatcacg tcaacgcttg gcgcagccgc 600 

ggcatgaacc ttgggaccat gaattatcaa gtcatggcca cggaaggttt cggctcgaac 660 

ggaagctcca acctcacagt atggtag 687 

<210> 204 
<211> 228 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(30)

<400> 204

Met Lys Ser Ala Arg Ala Leu Leu Val Ala Leu Ser Arg Ile Leu Pro
1 5 10 15

Ile Ala Leu Val Leu Leu Leu Ala Pro Val Pro Ala Gln Ala Gln Gln
20 25 30

Val Cys Asn Asn Gly Thr Gly Thr His Asn Gly Phe Phe Trp Thr Phe
35 40 45

Trp Lys Asp Gly Gly Thr Ala Cys Met Thr Leu Gly Ser Gly Gly Asn
50 55 60

Tyr Ser Thr Thr Phe Asn Leu Ser Gly Gly Arg Asn Leu Val Ala Gly
65 70 75 80

Lys Gly Trp Gln Thr Gly Ser Thr Asn Arg Val Val Gly Tyr Asn Ala
85 90 95

Gly Val Trp Asn Pro Gly Thr Asn Ser Tyr Leu Thr Leu Tyr Gly Trp
100 105 110

Ser Thr Asn Pro Leu Val Glu Tyr Tyr Val Val Asp His Trp Gly Ser
115 120 125

Gln Phe Thr Pro Pro Gly Asn Gly Ala Gln Ser Met Gly Thr Val Thr
130 135 140

Thr Asp Gly Gly Thr Tyr Asn Ile Tyr Arg Thr Gln Arg Val Asn Ala
145 150 155 160

Pro Ser Ile Ile Gly Asn Ala Thr Phe Tyr Gln Tyr Trp Ser Val Arg
165 170 175

Thr Ser Arg Arg Gly Gln Gly Thr Asn Asn Thr Ile Thr Phe Ala Asn
180 185 190

His Val Asn Ala Trp Arg Ser Arg Gly Met Asn Leu Gly Thr Met Asn
195 200 205

Tyr Gln Val Met Ala Thr Glu Gly Phe Gly Ser Asn Gly Ser Ser Asn
210 215 220

Leu Thr Val Trp
225

<210> 205 
<211> 1068 
<212> DNA
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 205 

atgcaaattt tcaaatcacc actgtcatgg gccggatcac tattactgat cctgtccacc 60 

gccctgtttt caacagcggc cactgcccag gaatactgct ccaaccagac cggtacacac 120 

agcggttttt actttaccca ttggtctgac ggcggcggta ctgcctgcat tactctggga 180 

gacgacggaa attacagtta cacctggtcc aacacaggca attttgtcgg tggcaagggc 240 

tggagtaccg gcacctccaa tcgggtgatc ggttacaacg ccggagacta ctcgccctcc 300 

ggcaactcct acctggcgct gtatggctgg agcaccaatc cactgattga gtactacgtg 360 

gtggatagct ggggtagctg gcgtccgccg ggtggcacct cggtaggtac agtcaccagc 420 

gatggcggga cttacgacct gtaccgcacc gagcgcgtgc agcagccctc catcgaaggc 480 

acggccacct tctatcaata ttggagcgtg cgcacctcac agcgtcccca ggggcagaac 540 

aacaccatca cctttcagaa ccacgtggat gcctgggcca atcagggctg gaacctcggc 600 

acccacaact atcaggtaat ggcgaccgaa ggctacgaaa gcagcggcag ctccaacgtc 660 

acggtttggg attccggcac cagtagcggt aacggtggca acgctggcgg cggtggtggc 720 

gaggcaggta acggctccaa ctcactggtc gtgcgtgcgg tgggcacttc gggcaacgaa 780 

cagttgcgcg tcaacgtcag cggcaacacg gttgaaaccc tgaacctgtc taccaactgg 840 

caggactaca ccatcaacac caacgcttcc ggcgatgtga atgtggagtt gatcaacgat 900 

cagggcgagg gctacgaagc ccgggtggaa tacgtcatcg tcaacggcga tacccgctac 960 

ggcgctgatc agagctacaa caccagcgcc tgggacggcg agtgcggcgg cggttccttt 1020 

accatgtgga tgcactgcga aggcatcctc ggttttggcg atatgtaa 1068 

<210> 206 
<211> 355 
<212> PRT
<213> Unknown

<220>

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(29)

<400> 206

Met Gln Ile Phe Lys Ser Pro Leu Ser Trp Ala Gly Ser Leu Leu Leu
1 5 10 15

Ile Leu Ser Thr Ala Leu Phe Ser Thr Ala Ala Thr Ala Gln Glu Tyr
20 25 30

Cys Ser Asn Gln Thr Gly Thr His Ser Gly Phe Tyr Phe Thr His Trp
35 40 45

Ser Asp Gly Gly Gly Thr Ala Cys Ile Thr Leu Gly Asp Asp Gly Asn
50 55 60

Tyr Ser Tyr Thr Trp Ser Asn Thr Gly Asn Phe Val Gly Gly Lys Gly
65 70 75 80

Trp Ser Thr Gly Thr Ser Asn Arg Val Ile Gly Tyr Asn Ala Gly Asp
85 90 95

Tyr Ser Pro Ser Gly Asn Ser Tyr Leu Ala Leu Tyr Gly Trp Ser Thr
100 105 110

Asn Pro Leu Ile Glu Tyr Tyr Val Val Asp Ser Trp Gly Ser Trp Arg
115 120 125

Pro Pro Gly Gly Thr Ser Val Gly Thr Val Thr Ser Asp Gly Gly Thr
130 135 140

Tyr Asp Leu Tyr Arg Thr Glu Arg Val Gln Gln Pro Ser Ile Glu Gly
145 150 155 160

Thr Ala Thr Phe Tyr Gln Tyr Trp Ser Val Arg Thr Ser Gln Arg Pro
165 170 175

Gln Gly Gln Asn Asn Thr Ile Thr Phe Gln Asn His Val Asp Ala Trp
180 185 190

Ala Asn Gln Gly Trp Asn Leu Gly Thr His Asn Tyr Gln Val Met Ala
195 200 205

Thr Glu Gly Tyr Glu Ser Ser Gly Ser Ser Asn Val Thr Val Trp Asp
210 215 220

Ser Gly Thr Ser Ser Gly Asn Gly Gly Asn Ala Gly Gly Gly Gly Gly
225 230 235 240

Glu Ala Gly Asn Gly Ser Asn Ser Leu Val Val Arg Ala Val Gly Thr
245 250 255

Ser Gly Asn Glu Gln Leu Arg Val Asn Val Ser Gly Asn Thr Val Glu
260 265 270

Thr Leu Asn Leu Ser Thr Asn Trp Gln Asp Tyr Thr Ile Asn Thr Asn
275 280 285

Ala Ser Gly Asp Val Asn Val Glu Leu Ile Asn Asp Gln Gly Glu Gly
290 295 300

Tyr Glu Ala Arg Val Glu Tyr Val Ile Val Asn Gly Asp Thr Arg Tyr
305 310 315 320

Gly Ala Asp Gln Ser Tyr Asn Thr Ser Ala Trp Asp Gly Glu Cys Gly
325 330 335

Gly Gly Ser Phe Thr Met Trp Met His Cys Glu Gly Ile Leu Gly Phe
340 345 350

Gly Asp Met
355

<210> 207 
<211> 633 
<212> DNA 
<213> Unknown

<220>

<223> Obtained from an environmental sample
<400> 207 

atgaaattaa aaaagaagat gctcacttta ctcctgacgg cttcgatgag tttcggttta 60 

tttggggcaa cctcgagtgc agcaacggat tattggcaat attggacgga tggcggcgga 120 

acggtgaatg cggttaacgg gtccgggggc aattacagcg taacttggca aaatagcggg 180 

aacttcgtgg tcggcaaagg ctggagcgta gggtcgccaa atcggacgat caattacaat 240 

gccggcatct gggaaccttc ggggaacggg tacttgaccc tttacggatg gactagaaac 300 

tcgctgatcg agtattacgt tgtcgacagt tgggggacgt accggccaac aggtactcac 360 

aaaggaacgg tgaacagcga cggaggcacc tacgatattt atacgaccat gcgctataat 420 

gcgccttcca ttgatggcac gcagacgttc caacagttct ggagcgtgcg gcaatcgaaa 480 

cgaccaaccg gcagcaacgt ctccatcacc ttcagcaatc acgtgaatgc ctggagaagc 540 

aagggcatga acctgggcag cagctggtcg taccaggtct tggcgacgga aggctatcag 600 

agcagcggaa gatccaacgt cacggtgtgg taa 633 
<210> 208
<211> 210
<212> PRT
<213> Unknown

<220>

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(27)

<400> 208

Met Lys Leu Lys Lys Lys Met Leu Thr Leu Leu Leu Thr Ala Ser Met
1 5 10 15

Ser Phe Gly Leu Phe Gly Ala Thr Ser Ser Ala Ala Thr Asp Tyr Trp
20 25 30

Gln Tyr Trp Thr Asp Gly Gly Gly Thr Val Asn Ala Val Asn Gly Ser
35 40 45

Gly Gly Asn Tyr Ser Val Thr Trp Gln Asn Ser Gly Asn Phe Val Val
50 55 60

Gly Lys Gly Trp Ser Val Gly Ser Pro Asn Arg Thr Ile Asn Tyr Asn
65 70 75 80

Ala Gly Ile Trp Glu Pro Ser Gly Asn Gly Tyr Leu Thr Leu Tyr Gly
85 90 95

Trp Thr Arg Asn Ser Leu Ile Glu Tyr Tyr Val Val Asp Ser Trp Gly
100 105 110

Thr Tyr Arg Pro Thr Gly Thr His Lys Gly Thr Val Asn Ser Asp Gly
115 120 125

Gly Thr Tyr Asp Ile Tyr Thr Thr Met Arg Tyr Asn Ala Pro Ser Ile
130 135 140

Asp Gly Thr Gln Thr Phe Gln Gln Phe Trp Ser Val Arg Gln Ser Lys
145 150 155 160

Arg Pro Thr Gly Ser Asn Val Ser Ile Thr Phe Ser Asn His Val Asn
165 170 175

Ala Trp Arg Ser Lys Gly Met Asn Leu Gly Ser Ser Trp Ser Tyr Gln
180 185 190

Val Leu Ala Thr Glu Gly Tyr Gln Ser Ser Gly Arg Ser Asn Val Thr
195 200 205

Val Trp
210

<210> 209 
<211> 1194 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 209 

atgaaaacat ttagtgtgac caagtctagc gttgttttcg caatggcttt gggtatggct 60 

tcgacagctt ttgctcagga tttctgcagc aatgcgcaac attccggcca aaaggtaacg 120 

attacttcga accaaactgg taaaatcggc gatatcggtt acgaactctg ggacgaaaac 180 

ggtcatggtg gtagtgctac cttctatagc gatggttcca tggactgcaa tatcactggt 240 

gctaaggact atctctgccg tgcgggcctt tccctcggca gtaacaagac ctacaaggaa 300 

cttggtggtg atatgattgc cgagttcaag cttgtgaaga gcggtgccca gaatgtgggt 360 

tactcttata tcggtatcta tggctggatg gaaggtgttt ctggaacgcc tagccagttg 420 

gtcgaatact acgtgattga taacaccctc gccaatgaca tgccgggtag ctggattggt 480 

aacgaaagaa agggtaccat tacggttgac ggcggtacct atactgttta tcgcaatacc 540 

cgtacaggtc cggctattaa gaacagcggt aacgtcacgt tctatcagta tttcagcgtt 600 

cgtacctctc cgcgcgattg cggtaccatc aatatttccg aacacatgag acagtgggaa 660 

aagatgggca tgaccatggg taagctctac gaagccaagg tgcttggcga agcgggtaac 720 

gtgaatggcg aagtccgcgg tggtcacatg gacttcccgc atgctaaggt ttatgtgaaa 780 

aacggctctg atccggcttc ttcctcttct gtgaagtcca gctcttctac agtaacgcca 840 

aaatccagct cctcgaaggg taacggcaac gtttctggta aaattgacgc ctgcaaggac 900 

gctatgggcc atgaaggcaa agaaacgaga actcagggtc agaacaactc tagcgtgacg 960 

ggtaacgtcg gcagctctcc gtaccactat gaaatttggt atcagggtgg taacaactcc 1020 

atgacgttct acgacaacgg tacttataag gcaagctgga atggtaccaa cgacttcctt 1080 
gctcgtgtcg gtttcaagta tgatgaaaag cacacttacg aagaacttgg ccctatcgat 1140

gcctactaca agtggagcaa gcagggtagt gctggtggct acaactacat cggt 1194

<210> 210 
<211> 398 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(25) 

<400> 210 

Met Lys Thr Phe Ser Val Thr Lys Ser Ser Val Val Phe Ala Met Ala
1 5 10 15

Leu Gly Met Ala Ser Thr Ala Phe Ala Gln Asp Phe Cys Ser Asn Ala
20 25 30

Gln His Ser Gly Gln Lys Val Thr Ile Thr Ser Asn Gln Thr Gly Lys
35 40 45

Ile Gly Asp Ile Gly Tyr Glu Leu Trp Asp Glu Asn Gly His Gly Gly
50 55 60

Ser Ala Thr Phe Tyr Ser Asp Gly Ser Met Asp Cys Asn Ile Thr Gly
65 70 75 80

Ala Lys Asp Tyr Leu Cys Arg Ala Gly Leu Ser Leu Gly Ser Asn Lys
85 90 95

Thr Tyr Lys Glu Leu Gly Gly Asp Met Ile Ala Glu Phe Lys Leu Val
100 105 110

Lys Ser Gly Ala Gln Asn Val Gly Tyr Ser Tyr Ile Gly Ile Tyr Gly
115 120 125

Trp Met Glu Gly Val Ser Gly Thr Pro Ser Gln Leu Val Glu Tyr Tyr
130 135 140

Val Ile Asp Asn Thr Leu Ala Asn Asp Met Pro Gly Ser Trp Ile Gly
145 150 155 160

Asn Glu Arg Lys Gly Thr Ile Thr Val Asp Gly Gly Thr Tyr Thr Val
165 170 175

Tyr Arg Asn Thr Arg Thr Gly Pro Ala Ile Lys Asn Ser Gly Asn Val
180 185 190

Thr Phe Tyr Gln Tyr Phe Ser Val Arg Thr Ser Pro Arg Asp Cys Gly
195 200 205

Thr Ile Asn Ile Ser Glu His Met Arg Gln Trp Glu Lys Met Gly Met
210 215 220

Thr Met Gly Lys Leu Tyr Glu Ala Lys Val Leu Gly Glu Ala Gly Asn
225 230 235 240

Val Asn Gly Glu Val Arg Gly Gly His Met Asp Phe Pro His Ala Lys
245 250 255

Val Tyr Val Lys Asn Gly Ser Asp Pro Ala Ser Ser Ser Ser Val Lys
260 265 270

Ser Ser Ser Ser Thr Val Thr Pro Lys Ser Ser Ser Ser Lys Gly Asn
275 280 285

Gly Asn Val Ser Gly Lys Ile Asp Ala Cys Lys Asp Ala Met Gly His
290 295 300

Glu Gly Lys Glu Thr Arg Thr Gln Gly Gln Asn Asn Ser Ser Val Thr
305 310 315 320

Gly Asn Val Gly Ser Ser Pro Tyr His Tyr Glu Ile Trp Tyr Gln Gly
325 330 335

Gly Asn Asn Ser Met Thr Phe Tyr Asp Asn Gly Thr Tyr Lys Ala Ser
340 345 350

Trp Asn Gly Thr Asn Asp Phe Leu Ala Arg Val Gly Phe Lys Tyr Asp
355 360 365

Glu Lys His Thr Tyr Glu Glu Leu Gly Pro Ile Asp Ala Tyr Tyr Lys
370 375 380

Trp Ser Lys Gln Gly Ser Ala Gly Gly Tyr Asn Tyr Ile Gly
385 390 395

<210> 211 
<211> 1086 
<212> DNA 
<213> Unknown
<220> 

<223> Obtained from an environmental sample 
<400> 211 

atgataagtt ctaaagcatc acagtcatgg ggctggtcac tattggtggc cctgtccgcc 60 

gttctgcttt cggcgacagc ttccgcccag caacactgct ccaaccaaac cggtacgcac 120 

aacggttttt actttaccca ttggtcagac ggtggcggta ccgcctgcat gactctgggg 180 

gacgacggca actacagcta tacctggtcc aacactggca attttgtcgg tggtaagggc 240 

tggagcacag gtacatccaa ccgggtgatt ggttacaacg ccggagacta ctcgccctcc 300 

ggcaactcct acctggcact gtatggctgg agcaccaatc cgctgattga atattacgtg 360 

gtcgacagtt ggggcagctg gcgtccgccg ggtggcacct ctgtgggcac ggtaaccagc 420 

gacggtggca cttacgacct gtaccgaacc cagcgtgtgc agcagccctc cattgagggt 480 

acggccacct tctatcaata ctggagcgtg cgcacctcac agcggcctca ggggcaaaac 540 

aacaccatca cctttcagaa ccacgtgaat gcctgggcca atcagggctg gaatctgggc 600 

acccacaact atcaggtgat ggcgaccgaa ggctacgaaa gcagcggcag ctccaacgtc 660 

accgtttggg attccggcac cagtagcggt ggcggtggcg gtggcaacgc gggcggcggc 720 

ggagcccccg gtggtggtga ggctggaggc ggctccaact cactggttgt gcgtgcggtg 780

ggcacttcgg gcaatgaaca gttgcgcgtc aacgtcagtg gcaacacggt ggaaaccctg 840 

aacctgtcta ccaactggca ggactacacc atcaacacca acgcctccgg cgatgtcaat 900 

gtggaattga tcaacgacca gggcgaaggc tacgaggccc gcgtcgagta cgtcatcatc 960 

aacggcgata cccgctacgg cgccgaccag agctacaaca ccagcgcctg ggacggcgag 1020 

tgcggtagcg gttcctttac catgtggatg cactgcgaag gcatcctcgg ttttggcgat 1080 

atgtaa 1086 

<210> 212 
<211> 361 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(29) 

<400> 212 

Met Ile Ser Ser Lys Ala Ser Gln Ser Trp Gly Trp Ser Leu Leu Val
1 5 10 15

Ala Leu Ser Ala Val Leu Leu Ser Ala Thr Ala Ser Ala Gln Gln His
20 25 30

Cys Ser Asn Gln Thr Gly Thr His Asn Gly Phe Tyr Phe Thr His Trp
35 40 45

Ser Asp Gly Gly Gly Thr Ala Cys Met Thr Leu Gly Asp Asp Gly Asn
50 55 60

Tyr Ser Tyr Thr Trp Ser Asn Thr Gly Asn Phe Val Gly Gly Lys Gly
65 70 75 80

Trp Ser Thr Gly Thr Ser Asn Arg Val Ile Gly Tyr Asn Ala Gly Asp
85 90 95

Tyr Ser Pro Ser Gly Asn Ser Tyr Leu Ala Leu Tyr Gly Trp Ser Thr
100 105 110

Asn Pro Leu Ile Glu Tyr Tyr Val Val Asp Ser Trp Gly Ser Trp Arg
115 120 125

Pro Pro Gly Gly Thr Ser Val Gly Thr Val Thr Ser Asp Gly Gly Thr
130 135 140

Tyr Asp Leu Tyr Arg Thr Gln Arg Val Gln Gln Pro Ser Ile Glu Gly
145 150 155 160

Thr Ala Thr Phe Tyr Gln Tyr Trp Ser Val Arg Thr Ser Gln Arg Pro
165 170 175

Gln Gly Gln Asn Asn Thr Ile Thr Phe Gln Asn His Val Asn Ala Trp
180 185 190

Ala Asn Gln Gly Trp Asn Leu Gly Thr His Asn Tyr Gln Val Met Ala
195 200 205

Thr Glu Gly Tyr Glu Ser Ser Gly Ser Ser Asn Val Thr Val Trp Asp
210 215 220

Ser Gly Thr Ser Ser Gly Gly Gly Gly Gly Gly Asn Ala Gly Gly Gly
225 230 235 240

Gly Ala Pro Gly Gly Gly Glu Ala Gly Gly Gly Ser Asn Ser Leu Val
245 250 255

Val Arg Ala Val Gly Thr Ser Gly Asn Glu Gln Leu Arg Val Asn Val
260 265 270

Ser Gly Asn Thr Val Glu Thr Leu Asn Leu Ser Thr Asn Trp Gln Asp
275 280 285

Tyr Thr Ile Asn Thr Asn Ala Ser Gly Asp Val Asn Val Glu Leu Ile
290 295 300

Asn Asp Gln Gly Glu Gly Tyr Glu Ala Arg Val Glu Tyr Val Ile Ile
305 310 315 320

Asn Gly Asp Thr Arg Tyr Gly Ala Asp Gln Ser Tyr Asn Thr Ser Ala
325 330 335

Trp Asp Gly Glu Cys Gly Ser Gly Ser Phe Thr Met Trp Met His Cys
340 345 350

Glu Gly Ile Leu Gly Phe Gly Asp Met
355 360

<210> 213 
<211> 912 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample 
<400> 213 

gtgaacgcac aacaaaccct tacgtctaac tccaccggta ctcatggtgg tcactactat 60
tctttctgga aggactccgg caatgcgtcc ttcactctct acgatggcgg acgttacggc 120
tcgcaatgga atagcggcac caacaattgg gtgggcggta aaggctggaa cccgggcggc 180
gcaaaagtcg ttaactacga aggttattac ggcgttaaca attcccagaa ttcttacctg 240
gcactctacg ggtggacccg caatccgctg atcgagtact acataatcga aagttacggt 300
tcgtacaacc catcgagctg tagtggcggt actaactacg gtagcttcca aagcgatggt 360
gcgacctata acgtccgccg ttgccagcgc gtacagcagc catcgattga tggaacgcaa 420
acgttctatc agtatttcag cgttcgctca cccaaaaagg gcttcggcca aatcagcggc 480
actatcaatg taggcaacca ctttaattat tgggccagca aagggctgaa tttgggtagc 540
cacgattaca tggttctggc gactgaaggc tatcagagca gcggcaattc agatatttcc 600
gtgtccgaag gcagcagcgg cggctcttcc tcaggcggtt cgacctccag cggaagctcc 660
tccggtagta cgaccagttc ttcaggaggc ggtggcggcg gcatcacagt acgtgctcgc 720
ggcactaatg gtgatgagcg tatcagcctg cgtgtcggcg gttctgcggt agccagttgg 780
acactcagta ccagcgcaca aagctatagc tacacaggcg gcgcctctgg cgatatccag 840
gtggaattcg atatcaagct tatcgatacc gtcgacctcg agggggggcc cggtacccaa 900
ttcgccctat ag 912

<210> 214
<211> 303
<212> PRT
<213> unknown

<220>

<223> Obtained from an environmental sample
<400> 214

Val Asn Ala Gln Gln Thr Leu Thr Ser Asn Ser Thr Gly Thr His Gly
1 5 10 15

Gly His Tyr Tyr Ser Phe Trp Lys Asp Ser Gly Asn Ala Ser Phe Thr
20 25 30

Leu Tyr Asp Gly Gly Arg Tyr Gly Ser Gln Trp Asn Ser Gly Thr Asn
35 40 45

Asn Trp Val Gly Gly Lys Gly Trp Asn Pro Gly Gly Ala Lys Val Val
50 55 60

Asn Tyr Glu Gly Tyr Tyr Gly Val Asn Asn Ser Gln Asn Ser Tyr Leu
65 70 75 80

Ala Leu Tyr Gly Trp Thr Arg Asn Pro Leu Ile Glu Tyr Tyr Ile Ile
85 90 95

Glu Ser Tyr Gly Ser Tyr Asn Pro Ser Ser Cys Ser Gly Gly Thr Asn
100 105 110

Tyr Gly Ser Phe Gln Ser Asp Gly Ala Thr Tyr Asn Val Arg Arg Cys
115 120 125

Gln Arg Val Gln Gln Pro Ser Ile Asp Gly Thr Gln Thr Phe Tyr Gln
130 135 140

Tyr Phe Ser Val Arg Ser Pro Lys Lys Gly Phe Gly Gln Ile Ser Gly
145 150 155 160

Thr Ile Asn Val Gly Asn His Phe Asn Tyr Trp Ala Ser Lys Gly Leu
165 170 175

Asn Leu Gly Ser His Asp Tyr Met Val Leu Ala Thr Glu Gly Tyr Gln
180 185 190

Ser Ser Gly Asn Ser Asp Ile Ser Val Ser Glu Gly Ser Ser Gly Gly
195 200 205

Ser Ser Ser Gly Gly Ser Thr Ser Ser Gly Ser Ser Ser Gly Ser Thr
210 215 220

Thr Ser Ser Ser Gly Gly Gly Gly Gly Gly Ile Thr Val Arg Ala Arg
225 230 235 240

Gly Thr Asn Gly Asp Glu Arg Ile Ser Leu Arg Val Gly Gly Ser Ala
245 250 255

Val Ala Ser Trp Thr Leu Ser Thr Ser Ala Gln Ser Tyr Ser Tyr Thr
260 265 270

Gly Gly Ala Ser Gly Asp Ile Gln Val Glu Phe Asp Ile Lys Leu Ile
275 280 285

Asp Thr Val Asp Leu Glu Gly Gly Pro Gly Thr Gln Phe Ala Leu
290 295 300

<210> 215 
<211> 1065 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 215 

atgtttgcaa gattcgagaa actggccgcg gcgggtaaag ccgtcgtggc cctggcaggg 60 

ctcgcccttt tgggcacggc gcctgccaat gcacagacct gtctcacgaa caattccacc 120 

ggcaccaaca acggctacta ctactcgttc tggaaggaca gcggcaacgt gaccttctgc 180 

atgtacgggg gcggccgcta tacctcgcag tggagcaaca tcaacaactg ggtgggcggc 240 

aagggctgga atccgggcgg tcgtcggacc gtcacctatt cggggacgtt caacccgaac 300 

ggcaattcct atctcacgct gtacggctgg accaccaatc cactggtcga gtactacatc 360 

gtcgacagct ggggcagctg gcgtccgccg ggttccggct acatgggttc cgtcacgagc 420 

gacggcggca cctacgacat ctatcgcacg cagcgcgtca accagccctc gatcatcggc 480 

accgcgacgt tctaccagta ctggagcgtg cggcagcaga agcgcgtggg tggcaccatc 540 

accaccggca accacttcga tgcctgggct tcgctgggca tgaacctcgg ccagcacaac 600 

tacatggtca tggccaccga gggctaccag agcagcggca gctccgacat cacggtgggc 660 

ggcaccagca gctcctcgtc gtcgagcggg ggcagcagca gcagtagcag cagcagcggg 720 

ggtggcggct cgaagagctt caccgtgcgc gcgcggggtt cgacgggcgg tgagcagatc 780 

agtttgcgcg tgaacaacca gaccgtgcag aactggacgc tgggcaccag catgcagaac 840 

tacaccgcgt ccaccaacct gagcggcggc atcaccgtgc acttcaccaa tgacagcggc 900 

aaccgcgacg tgcaggtgga ctacatccag gtgaacggcc agacgcgtca atccgagcag 960 

cagagctaca acaccgggct gtatgccaac ggcagctgtg gcggcggcgg ctacagcgag 1020 

tggatgcatt gcaatggcgc gatcggttac ggcaacacgc cgtag 1065 

<210> 216 

<211> 354 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(31) 

<400> 216 

Met Phe Ala Arg Phe Glu Lys Leu Ala Ala Ala Gly Lys Ala Val Val
1 5 10 15

Ala Leu Ala Gly Leu Ala Leu Leu Gly Thr Ala Pro Ala Asn Ala Gln
20 25 30

Thr Cys Leu Thr Asn Asn Ser Thr Gly Thr Asn Asn Gly Tyr Tyr Tyr
35 40 45

Ser Phe Trp Lys Asp Ser Gly Asn Val Thr Phe Cys Met Tyr Gly Gly
50 55 60

Gly Arg Tyr Thr Ser Gln Trp Ser Asn Ile Asn Asn Trp Val Gly Gly
65 70 75 80

Lys Gly Trp Asn Pro Gly Gly Arg Arg Thr Val Thr Tyr Ser Gly Thr
85 90 95

Phe Asn Pro Asn Gly Asn Ser Tyr Leu Thr Leu Tyr Gly Trp Thr Thr
100 105 110

Asn Pro Leu Val Glu Tyr Tyr Ile Val Asp Ser Trp Gly Ser Trp Arg
115 120 125

Pro Pro Gly Ser Gly Tyr Met Gly Ser Val Thr Ser Asp Gly Gly Thr
130 135 140

Tyr Asp Ile Tyr Arg Thr Gln Arg Val Asn Gln Pro Ser Ile Ile Gly
145 150 155 160

Thr Ala Thr Phe Tyr Gln Tyr Trp Ser Val Arg Gln Gln Lys Arg Val
165 170 175

Gly Gly Thr Ile Thr Thr Gly Asn His Phe Asp Ala Trp Ala Ser Leu
180 185 190

Gly Met Asn Leu Gly Gln His Asn Tyr Met Val Met Ala Thr Glu Gly
195 200 205

Tyr Gln Ser Ser Gly Ser Ser Asp Ile Thr Val Gly Gly Thr Ser Ser
210 215 220

Ser Ser Ser Ser Ser Gly Gly Ser Ser Ser Ser Ser Ser Ser Ser Gly
225 230 235 240

Gly Gly Gly Ser Lys Ser Phe Thr Val Arg Ala Arg Gly Ser Thr Gly
245 250 255

Gly Glu Gln Ile Ser Leu Arg Val Asn Asn Gln Thr Val Gln Asn Trp
260 265 270

Thr Leu Gly Thr Ser Met Gln Asn Tyr Thr Ala Ser Thr Asn Leu Ser
275 280 285

Gly Gly Ile Thr Val His Phe Thr Asn Asp Ser Gly Asn Arg Asp Val
290 295 300

Gln Val Asp Tyr Ile Gln Val Asn Gly Gln Thr Arg Gln Ser Glu Gln
305 310 315 320

Gln Ser Tyr Asn Thr Gly Leu Tyr Ala Asn Gly Ser Cys Gly Gly Gly
325 330 335

Gly Tyr Ser Glu Trp Met His Cys Asn Gly Ala Ile Gly Tyr Gly Asn
340 345 350

Thr Pro

<210> 217 
<211> 1083 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 217 

atgactttcg tcaagacgat caccggcaga cgcgccatcg cggcgttcct ctgcctcgcc 60 

ggcctctaca tggcgccggc aaacgcgcaa acctgcatca cgtccagcca gaccggcacc 120 

aacaacggga actacttttc gttctggaaa gacagcccgg gcacggtgaa cttctgcatg 180 

tacccgaatg gccgctacac ctcgaactgg agcggcatca acaactgggt cggcggcaag 240 

ggctggtcga ccggctccag ccgcaccgtc agctattcgg gcagcttcaa ttcgcccggc 300 

aacggctacc tgactctcta cgggtggacc accaacccgc tcatcgagta ctacatcgtc 360 

gagaactggg gtaactaccg cccgccgggc ggccaggggt acatggggac cgtcaattcc 420 

gacggggcga cctatgacat ctaccggacc ttccgggaca accagccctg catcacgggc 480 

aactcctgcg acttctacca gtactggagc gtgcgccagt ccaagcgcag cagcggcacc 540 

atcaccacgg ccaatcactt cgcggcgtgg aacagcctcg gcatgaacct gggccagcac 600 

aactaccagg tcatggccac cgagggttac cagagcagcg gcagctccga catcacggtc 660 

acggaaggcg gcggcggcag cagcaatggt ggcagcagca acggcggcag cagcaatggc 720 

ggcagcagca atggcggcgg cggcggcacc aagagcttca cggtccgcgc ccgtggcacc 780 

gcgggtggcg agtccatcac gctgcgtgtc aacaaccaga acgtgcagac ctggacgctg 840 

ggcaccggca tgcagaacta cacggcctcg acctcgctga gcggtggcat cacggtgcac 900 

ttcaccaacg acggcggaag ccgcgacgtg caggtggact acatccaggt gaacggcagc 960 

acgcgccagt ccgaggcaca gagctacaac accggcgcct acctgaacgg ccgttgcggc 1020 

ggtggcggca acagcgaatg gatgcattgc aacggcgcca tcggctacgg caatacgccc 1080 

tga 1083



<210> 218 
<211> 360
<212> PRT 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(29)

<400> 218

Met Thr Phe Val Lys Thr Ile Thr Gly Arg Arg Ala Ile Ala Ala Phe
1 5 10 15

Leu Cys Leu Ala Gly Leu Tyr Met Ala Pro Ala Asn Ala Gln Thr Cys
20 25 30

Ile Thr Ser Ser Gln Thr Gly Thr Asn Asn Gly Asn Tyr Phe Ser Phe
35 40 45

Trp Lys Asp Ser Pro Gly Thr Val Asn Phe Cys Met Tyr Pro Asn Gly
50 55 60

Arg Tyr Thr Ser Asn Trp Ser Gly Ile Asn Asn Trp Val Gly Gly Lys
65 70 75 80

Gly Trp Ser Thr Gly Ser Ser Arg Thr Val Ser Tyr Ser Gly Ser Phe
85 90 95

Asn Ser Pro Gly Asn Gly Tyr Leu Thr Leu Tyr Gly Trp Thr Thr Asn
100 105 110

Pro Leu Ile Glu Tyr Tyr Ile Val Glu Asn Trp Gly Asn Tyr Arg Pro
115 120 125

Pro Gly Gly Gln Gly Tyr Met Gly Thr Val Asn Ser Asp Gly Ala Thr
130 135 140

Tyr Asp Ile Tyr Arg Thr Phe Arg Asp Asn Gln Pro Cys Ile Thr Gly
145 150 155 160

Asn Ser Cys Asp Phe Tyr Gln Tyr Trp Ser Val Arg Gln Ser Lys Arg
165 170 175

Ser Ser Gly Thr Ile Thr Thr Ala Asn His Phe Ala Ala Trp Asn Ser
180 185 190

Leu Gly Met Asn Leu Gly Gln His Asn Tyr Gln Val Met Ala Thr Glu
195 200 205

Gly Tyr Gln Ser Ser Gly Ser Ser Asp Ile Thr Val Thr Glu Gly Gly
210 215 220

Gly Gly Ser Ser Asn Gly Gly Ser Ser Asn Gly Gly Ser Ser Asn Gly
225 230 235 240

Gly Ser Ser Asn Gly Gly Gly Gly Gly Thr Lys Ser Phe Thr Val Arg
245 250 255

Ala Arg Gly Thr Ala Gly Gly Glu Ser Ile Thr Leu Arg Val Asn Asn
260 265 270

Gln Asn Val Gln Thr Trp Thr Leu Gly Thr Gly Met Gln Asn Tyr Thr
275 280 285

Ala Ser Thr Ser Leu Ser Gly Gly Ile Thr Val His Phe Thr Asn Asp
290 295 300

Gly Gly Ser Arg Asp Val Gln Val Asp Tyr Ile Gln Val Asn Gly Ser
305 310 315 320

Thr Arg Gln Ser Glu Ala Gln Ser Tyr Asn Thr Gly Ala Tyr Leu Asn
325 330 335

Gly Arg Cys Gly Gly Gly Gly Asn Ser Glu Trp Met His Cys Asn Gly
340 345 350

Ala Ile Gly Tyr Gly Asn Thr Pro
355 360

<210> 219 
<211> 1029 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<400> 219

atgacatcag gtctcaagaa agtgatggca ttcgtctgtc tcgccaccct tggcgtttcg 60

gcgcatgccc agacatgtat tcagtccagt cagaccggca ccaacaacgg attctatttc 120

tccttctgga aggacaaccc gggcacggtg cagttctgcc tgcagagcgg cggtcgttac 180

acctccaact ggaacggcat caacaactgg gtgggcggca aggggtggca gaccggcgca 240 

cgccgcacgg tgaactactc gggctcgttc aactcgccgg gcaacggcta tctggcgctg 300 

tacggctgga ccaccaatcc gctggtcgag tactacatcg tcgacagctg gggcagcttc 360 

cgtccgccgg gcaacactgc aggcctgtgg gtactggtga acagcgatgg cggcacctac 420 

gacatctatc gcgcgcatcg cagtaacgcg ccctgcatca ccggcagcag ctgcgacttc 480 

gaccagtact ggagcgtgcg acagtcgaag cgcgtcggcg gcaccatcac caccggcaac 540 

cacttcgatg cctgggcgaa ccaccagatg aatctgggcc agttcaacta ccagatcatg 600 

gctaccgagg gtttccagag caacggcagc tccgacatca ccgtcagtga atgcaccagc 660 

aattgcggcg gtggcggcgg cggcgggggt ggcagcaaca gcatcacggt gcgcgcgcgc 720 

ggcacgggcg gcggcgagca gatccggctg cgggtgaaca acaccacggt gcaaacctgg 780 

acgctgacca ccagctacca gaacttcacg gcttcgacct cgctgagcgg cggcaccatc 840 

gtcgagtact tcaacgacag ttccggccat gacgtgcagg tcgactacat catcgtgaat 900 

ggcgtgaccc gccagtccga atcgcagagc tacaacaccg ggctgtatgc caacgggcgt 960 

tgcggcggcg gctccaacag cgagtggatg cattgcaacg gtgccattgg atacggaaat 1020 

accccgtaa 1029

<210> 220 
<211> 342 
<212> PRT 
<213> unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(23) 

<400> 220 

Met Thr Ser Gly Leu Lys Lys Val Met Ala Phe Val Cys Leu Ala Thr
1 5 10 15

Leu Gly Val Ser Ala His Ala Gln Thr Cys Ile Gln Ser Ser Gln Thr
20 25 30

Gly Thr Asn Asn Gly Phe Tyr Phe Ser Phe Trp Lys Asp Asn Pro Gly
35 40 45

Thr Val Gln Phe Cys Leu Gln Ser Gly Gly Arg Tyr Thr Ser Asn Trp
50 55 60

Asn Gly Ile Asn Asn Trp Val Gly Gly Lys Gly Trp Gln Thr Gly Ala
65 70 75 80

Arg Arg Thr Val Asn Tyr Ser Gly Ser Phe Asn Ser Pro Gly Asn Gly
85 90 95

Tyr Leu Ala Leu Tyr Gly Trp Thr Thr Asn Pro Leu Val Glu Tyr Tyr
100 105 110

Ile Val Asp Ser Trp Gly Ser Phe Arg Pro Pro Gly Asn Thr Ala Gly
115 120 125

Leu Trp Val Leu Val Asn Ser Asp Gly Gly Thr Tyr Asp Ile Tyr Arg
130 135 140

Ala His Arg Ser Asn Ala Pro Cys Ile Thr Gly Ser Ser Cys Asp Phe
145 150 155 160

Asp Gln Tyr Trp Ser Val Arg Gln Ser Lys Arg Val Gly Gly Thr Ile
165 170 175

Thr Thr Gly Asn His Phe Asp Ala Trp Ala Asn His Gln Met Asn Leu
180 185 190

Gly Gln Phe Asn Tyr Gln Ile Met Ala Thr Glu Gly Phe Gln Ser Asn
195 200 205

Gly Ser Ser Asp Ile Thr Val Ser Glu Cys Thr Ser Asn Cys Gly Gly
210 215 220

Gly Gly Gly Gly Gly Gly Gly Ser Asn Ser Ile Thr Val Arg Ala Arg
225 230 235 240

Gly Thr Gly Gly Gly Glu Gln Ile Arg Leu Arg Val Asn Asn Thr Thr
245 250 255

Val Gln Thr Trp Thr Leu Thr Thr Ser Tyr Gln Asn Phe Thr Ala Ser
260 265 270

Thr Ser Leu Ser Gly Gly Thr Ile Val Glu Tyr Phe Asn Asp Ser Ser
275 280 285

Gly His Asp Val Gln Val Asp Tyr Ile Ile Val Asn Gly Val Thr Arg
290 295 300

Gln Ser Glu Ser Gln Ser Tyr Asn Thr Gly Leu Tyr Ala Asn Gly Arg
305 310 315 320

Cys Gly Gly Gly Ser Asn Ser Glu Trp Met His Cys Asn Gly Ala Ile
325 330 335

Gly Tyr Gly Asn Thr Pro
340

<210> 221 
<211> 1044 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 221 

atgattgtta gtttcaagag cgtgaaggca ctcgcgtgcc tggccgtgct cggcgtgacc 60 

gccgcgcagg cgcaaacctg catcaattcc agccagaccg gcaccaacaa cggcaattat 120 

ttttcattct ggaaagacaa cccgggcacg gtgaccttct gcatgtatgc caacggccgc 180 

tacacctcca actggagcgg catcaacaac tgggtgggtg gcaagggctg gcagaccggc 240 

tcgaatcgca cggtgaccta ctccggttcg ttcaactcgc ccggcaacgg ctacctcacc 300 

ctgtacgggt ggaccacgaa tccgctgatc gagtactaca tcgtcgacag ttggggcagt 360 

tatcgaccgc ccggcggcca gggcttcatg ggcaccgtga cgaccgacgg cggcacctac 420 

gacatctatc gcacgcagcg cgtgaaccag ccttccatca tcggcaccgc gacgttctac 480 

cagtactgga gcgtgcggca gtcgaagcgc gtggggggca ccatcaccac cgccaaccac 540 

ttcaatgcct gggcgacgct gggcatgaac ctgggccagc acaactacca ggtcatggcc 600 

accgagggtt accagagcag cggcagctcc gacatcaccg tgaccgaagg cggcggcagc 660 

tcgtcgtcgt cgagcggcgg cggcagcacc agcagcggcg gtggcggcag caagagcttc 720 

acggtgcgcg cccgcggcac ggtcggcggc gaaaacatcc agctgcaggt caacaaccag 780 

acggtggcga gctggaacct gaccaccagc atgcagaact acaacgcctc gaccagcctg 840 

agtggcggca tcaccgtggt ctacaccaac gacggcggta accgcgacgt ccaggtcgac 900 

tacatcaccg tgaacggcca gacccgccag tccgaagcgc agagtttcaa caccgggctg 960 

tatgccaacg gacgttgtgg cggcggctcg aacagcgagt ggatgcattg caatggcgcg 1020 

atcggctacg gcaacacgcc gtaa 1044

<210> 222 
<211> 347 
<212> PRT 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(24) 

<400> 222 

Met Ile Val Ser Phe Lys Ser Val Lys Ala Leu Ala Cys Leu Ala Val
1 5 10 15

Leu Gly Val Thr Ala Ala Gln Ala Gln Thr Cys Ile Asn Ser Ser Gln
20 25 30

Thr Gly Thr Asn Asn Gly Asn Tyr Phe Ser Phe Trp Lys Asp Asn Pro
35 40 45

Gly Thr Val Thr Phe Cys Met Tyr Ala Asn Gly Arg Tyr Thr Ser Asn
50 55 60

Trp Ser Gly Ile Asn Asn Trp Val Gly Gly Lys Gly Trp Gln Thr Gly
65 70 75 80

Ser Asn Arg Thr Val Thr Tyr Ser Gly Ser Phe Asn Ser Pro Gly Asn
85 90 95

Gly Tyr Leu Thr Leu Tyr Gly Trp Thr Thr Asn Pro Leu Ile Glu Tyr
100 105 110

Tyr Ile Val Asp Ser Trp Gly Ser Tyr Arg Pro Pro Gly Gly Gln Gly
115 120 125

Phe Met Gly Thr Val Thr Thr Asp Gly Gly Thr Tyr Asp Ile Tyr Arg
130 135 140

Thr Gln Arg Val Asn Gln Pro Ser Ile Ile Gly Thr Ala Thr Phe Tyr
145 150 155 160

Gln Tyr Trp Ser Val Arg Gln Ser Lys Arg Val Gly Gly Thr Ile Thr
165 170 175

Thr Ala Asn His Phe Asn Ala Trp Ala Thr Leu Gly Met Asn Leu Gly
180 185 190

Gln His Asn Tyr Gln Val Met Ala Thr Glu Gly Tyr Gln Ser Ser Gly
195 200 205

Ser Ser Asp Ile Thr Val Thr Glu Gly Gly Gly Ser Ser Ser Ser Ser
210 215 220

Ser Gly Gly Gly Ser Thr Ser Ser Gly Gly Gly Gly Ser Lys Ser Phe
225 230 235 240

Thr Val Arg Ala Arg Gly Thr Val Gly Gly Glu Asn Ile Gln Leu Gln
245 250 255

Val Asn Asn Gln Thr Val Ala Ser Trp Asn Leu Thr Thr Ser Met Gln
260 265 270

Asn Tyr Asn Ala Ser Thr Ser Leu Ser Gly Gly Ile Thr Val Val Tyr
275 280 285

Thr Asn Asp Gly Gly Asn Arg Asp Val Gln Val Asp Tyr Ile Thr Val
290 295 300

Asn Gly Gln Thr Arg Gln Ser Glu Ala Gln Ser Phe Asn Thr Gly Leu
305 310 315 320

Tyr Ala Asn Gly Arg Cys Gly Gly Gly Ser Asn Ser Glu Trp Met His
325 330 335

Cys Asn Gly Ala Ile Gly Tyr Gly Asn Thr Pro
340 345

<210> 223 
<211> 642 
<212> DNA 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
<400> 223 

atgtttaagt ttaaaaagaa tttcttagtt ggattatcgg cagctttaat gagtattagc 60 

ttgttttcgg caaccgcctc tgcagctagc acagactact ggcaaaattg gactgatggg 120 

ggcggtatag taaacgctgt caatgggtct ggcgggaatt acagtgttaa ttggtctaat 180 

accggaaatt tcgttgttgg taaaggttgg actacaggtt cgccatttag gacgataaac 240 

tataatgccg gagtttgggc accgaatgga aatggatatt taactttata tggttggacg 300 

agatcacctc tcatagaata ttatgtagtg gattcatggg gtacttatag acctactgga 360 

acgtataaag gtactgtaaa aagtgatggg ggtacatatg acatatatac aactacacgt 420 

tataacgcac cttccattga tggcgatcgc actactttta cgcagtactg gagtgttcgc 480 

caaacgaaga gaccaaccgg aagcaacgct acaatcactt tcagcaatca tgttaacgca 540 

tggaagagcc atggaatgaa tctgggcagt aattgggctt accaagtcat ggcgacagaa 600 

ggatatcaaa gtagtggaag ttctaacgta acagtgtggt aa 642 

<210> 224 
<211> 213 
<212> PRT 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(28)

<400> 224

Met Phe Lys Phe Lys Lys Asn Phe Leu Val Gly Leu Ser Ala Ala Leu
1 5 10 15

Met Ser Ile Ser Leu Phe Ser Ala Thr Ala Ser Ala Ala Ser Thr Asp
20 25 30

Tyr Trp Gln Asn Trp Thr Asp Gly Gly Gly Ile Val Asn Ala Val Asn
35 40 45

Gly Ser Gly Gly Asn Tyr Ser Val Asn Trp Ser Asn Thr Gly Asn Phe
50 55 60

Val Val Gly Lys Gly Trp Thr Thr Gly Ser Pro Phe Arg Thr Ile Asn
65 70 75 80

Tyr Asn Ala Gly Val Trp Ala Pro Asn Gly Asn Gly Tyr Leu Thr Leu
85 90 95

Tyr Gly Trp Thr Arg Ser Pro Leu Ile Glu Tyr Tyr Val Val Asp Ser
100 105 110

Trp Gly Thr Tyr Arg Pro Thr Gly Thr Tyr Lys Gly Thr Val Lys Ser
115 120 125

Asp Gly Gly Thr Tyr Asp Ile Tyr Thr Thr Thr Arg Tyr Asn Ala Pro
130 135 140

Ser Ile Asp Gly Asp Arg Thr Thr Phe Thr Gln Tyr Trp Ser Val Arg
145 150 155 160

Gln Thr Lys Arg Pro Thr Gly Ser Asn Ala Thr Ile Thr Phe Ser Asn
165 170 175

His Val Asn Ala Trp Lys Ser His Gly Met Asn Leu Gly Ser Asn Trp
180 185 190

Ala Tyr Gln Val Met Ala Thr Glu Gly Tyr Gln Ser Ser Gly Ser Ser
195 200 205

Asn Val Thr Val Trp
210

<210> 225 
<211> 1059 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 



<400> 225 

atgtttgtta gtctcaggaa gacggcttgg gcgtgcctgt tgctcgccgg cctcggaatc 60

tcgacttcac aagcccagac ctgcatcacg tccagcggga cgggcaccaa caacggccac 120

tactattcct tctggaagga cagtggcggc accgtcaact tctgcatgta cgcgaacggc 180

cgctacacct ccaactggag cggcatcaac aactgggtgg gcggcaaggg ctggcagacc 240

ggctcacgcc ggacgatcag ctactcgggc tcgttcaact cacccggcaa tggttatctc 300

accctgtacg gttggaccac caatccattg atcgagtact acatcgtcga caactggggc 360

acgtaccggc cgccgggagg ctcgggctac atgggcacgg tgacgagcga cggcggcacc 420

tacgacgtct atcgcaccca gcgcgtaaac cagccttcca tcatcggcac cgcgacgttc 480

tatcaatact ggagcgtgcg ccagcagaag cggaccggcg ggaccatcac caccggcaat 540

cacttcgacg cctgggccgc atacggaatg aacctcggca cccacaacta ccagatcatg 600

gcgaccgagg gttaccagag cagcggcagt tcggacatca cggtgagcga gggcggtggc 660

agttcatcga gcagcagctc gtcgagcagc agcagttcgt cctcttcgag cggcggcggc 720

ggcacgaaga gcttcacggt ccgcgcgcgc ggcacggcgg gcggtgaatc catcacgctg 780

cgcgtgaaca accagaacgt gcagacctgg acgctgggca cgtcgatgca gaactacacc 840

gcatcgacca cgctctccgg tggcatcacc gtcgcgtaca ccaacgacag cggcaatcgc 900

gacgtgcagg tggactacat cgtcgtgaac ggcgccaccc gccagtccga ggcgcagagc 960

tacaacaccg gtctctatgc caacggtcgt tgcggcggcg gctccaacag cgagtggatg 1020

cactgcaacg ggcagatcgg ctacgggaat actccctag 1059



<210> 226 
<211> 352 
<212> PRT 
<213> Unknown 



<220> 

<223> Obtained from an environmental sample



<221> SIGNAL 
<222> (1)...(25) 



<400> 226 

Met Phe Val Ser Leu Arg Lys Thr Ala Trp Ala Cys Leu Leu Leu Ala
1 5 10 15

Gly Leu Gly Ile Ser Thr Ser Gln Ala Gln Thr Cys Ile Thr Ser Ser
20 25 30

Gly Thr Gly Thr Asn Asn Gly His Tyr Tyr Ser Phe Trp Lys Asp Ser
35 40 45

Gly Gly Thr Val Asn Phe Cys Met Tyr Ala Asn Gly Arg Tyr Thr Ser
50 55 60

Asn Trp Ser Gly Ile Asn Asn Trp Val Gly Gly Lys Gly Trp Gln Thr
65 70 75 80

Gly Ser Arg Arg Thr Ile Ser Tyr Ser Gly Ser Phe Asn Ser Pro Gly
85 90 95

Asn Gly Tyr Leu Thr Leu Tyr Gly Trp Thr Thr Asn Pro Leu Ile Glu
100 105 110

Tyr Tyr Ile Val Asp Asn Trp Gly Thr Tyr Arg Pro Pro Gly Gly Ser
115 120 125

Gly Tyr Met Gly Thr Val Thr Ser Asp Gly Gly Thr Tyr Asp Val Tyr
130 135 140

Arg Thr Gln Arg Val Asn Gln Pro Ser Ile Ile Gly Thr Ala Thr Phe
145 150 155 160

Tyr Gln Tyr Trp Ser Val Arg Gln Gln Lys Arg Thr Gly Gly Thr Ile
165 170 175

Thr Thr Gly Asn His Phe Asp Ala Trp Ala Ala Tyr Gly Met Asn Leu
180 185 190

Gly Thr His Asn Tyr Gln Ile Met Ala Thr Glu Gly Tyr Gln Ser Ser
195 200 205

Gly Ser Ser Asp Ile Thr Val Ser Glu Gly Gly Gly Ser Ser Ser Ser
210 215 220

Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Gly Gly Gly
225 230 235 240

Gly Thr Lys Ser Phe Thr Val Arg Ala Arg Gly Thr Ala Gly Gly Glu
245 250 255

Ser Ile Thr Leu Arg Val Asn Asn Gln Asn Val Gln Thr Trp Thr Leu
260 265 270

Gly Thr Ser Met Gln Asn Tyr Thr Ala Ser Thr Thr Leu Ser Gly Gly
275 280 285

Ile Thr Val Ala Tyr Thr Asn Asp Ser Gly Asn Arg Asp Val Gln Val
290 295 300

Asp Tyr Ile Val Val Asn Gly Ala Thr Arg Gln Ser Glu Ala Gln Ser
305 310 315 320

Tyr Asn Thr Gly Leu Tyr Ala Asn Gly Arg Cys Gly Gly Gly Ser Asn
325 330 335

Ser Glu Trp Met His Cys Asn Gly Gln Ile Gly Tyr Gly Asn Thr Pro
340 345 350

<210> 227 
<211> 747 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 227 

atgggcggca cgactggtag tggcggctca gccgccgccg gcgcaggcac gagtggaagc 60 

gcgggcggta ccgccggagc gctcggcccc ggcggtaccc agggcagcgg tggcgcagcc 120 

ggtggtacga gcggaacggg cggggccatc agcagcagct gcacggaagc tgacaagacg 180 

gtctgcaaca acgaaaccgg tcgccactgc aattacacgt acgagtattg gaaggaccag 240 

ggaagcggtt gcctcgtgaa caaagccgac ggcttcagcg tcaactggaa caacatcaac 300 

aatctgctgg gtcgcaaggg tctgaggccc ggatcgtcga atcagacggt gacctaccag 360 

gcaaactacc agccgaacgg caattcatac ctgtgcgtat atggatggac gcaaaacccc 420 

ctcgtcgaat actacatcgt cgatagctgg ggcagctggc gcccgccggg gggaacgtcc 480 

atgggcaccg tcaacgcgga cggcggcacc tacgacatct accgcaccca gcgcgtcaac 540 

cagccttcca tcgaaggcac caagaccttc tatcaatact ggagcgttcg cactcagaag 600 

cgcacgagcg gaacgatcac ggttgccgct cacttcgacg cctgggcgac gaaggggatg 660 

aacatgggga gtctgtacga ggtgtcgatg accgtcgagg gctatcaaag cagcgggacc 720 

gccgacgtga gcttctcgat gaagtga 747 

<210> 228 
<211> 248 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(39)

<400> 228

Met Gly Gly Thr Thr Gly Ser Gly Gly Ser Ala Ala Ala Gly Ala Gly
1 5 10 15

Thr Ser Gly Ser Ala Gly Gly Thr Ala Gly Ala Leu Gly Pro Gly Gly
20 25 30

Thr Gln Gly Ser Gly Gly Ala Ala Gly Gly Thr Ser Gly Thr Gly Gly
35 40 45

Ala Ile Ser Ser Ser Cys Thr Glu Ala Asp Lys Thr Val Cys Asn Asn
50 55 60

Glu Thr Gly Arg His Cys Asn Tyr Thr Tyr Glu Tyr Trp Lys Asp Gln
65 70 75 80

Gly Ser Gly Cys Leu Val Asn Lys Ala Asp Gly Phe Ser Val Asn Trp
85 90 95

Asn Asn Ile Asn Asn Leu Leu Gly Arg Lys Gly Leu Arg Pro Gly Ser
100 105 110

Ser Asn Gln Thr Val Thr Tyr Gln Ala Asn Tyr Gln Pro Asn Gly Asn
115 120 125

Ser Tyr Leu Cys Val Tyr Gly Trp Thr Gln Asn Pro Leu Val Glu Tyr
130 135 140

Tyr Ile Val Asp Ser Trp Gly Ser Trp Arg Pro Pro Gly Gly Thr Ser
145 150 155 160

Met Gly Thr Val Asn Ala Asp Gly Gly Thr Tyr Asp Ile Tyr Arg Thr
165 170 175

Gln Arg Val Asn Gln Pro Ser Ile Glu Gly Thr Lys Thr Phe Tyr Gln
180 185 190

Tyr Trp Ser Val Arg Thr Gln Lys Arg Thr Ser Gly Thr Ile Thr Val
195 200 205

Ala Ala His Phe Asp Ala Trp Ala Thr Lys Gly Met Asn Met Gly Ser
210 215 220

Leu Tyr Glu Val Ser Met Thr Val Glu Gly Tyr Gln Ser Ser Gly Thr
225 230 235 240

Ala Asp Val Ser Phe Ser Met Lys
245

<210> 229 
<211> 642 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 229 

atgtttaagt ttacaaagaa attcttagtt gggttaacgg cagctttgat gagtattagc 60 

ttgttttcgg caaacgcctc tgcagctaac acagactact ggcaaaattg gactgatggg 120 

ggcggaacag taaacgctgt caatgggtct ggcgggaatt acagtgtgaa ttggtctaat 180 

accgggaatt tcgttgttgg taaaggttgg actacaggtt cgccatttag gacgataaac 240 

tataatgccg gagtttgggc gccgaatggc aatgcatatt tgactttata tggttggacg 300 

cgatcacccc tcatagaata ttatgtagtg gattcatggg gtacttatag acctactgga 360 

acgtataaag gtacggttta cagtgatggg ggtacatatg acgtgtacac aactacacgt 420 

tatgatgcac cttccattga tggcgataaa actactttta cgcagtactg gagtgttcgc 480 

cagtcgaaga gaccaactgg aagcaacgct acaatcactt tcagcaatca cgttaacgca 540 

tggaagagat atgggatgaa tctgggtagt aattggtctt accaagtctt agcgacagag 600 

ggatatcaaa gtagtggaag ttctaacgta acagtgtggt aa 642

<210> 230 
<211> 213 
<212> PRT 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(28) 

<400> 230 

Met Phe Lys Phe Thr Lys Lys Phe Leu Val Gly Leu Thr Ala Ala Leu
1 5 10 15

Met Ser Ile Ser Leu Phe Ser Ala Asn Ala Ser Ala Ala Asn Thr Asp
20 25 30

Tyr Trp Gln Asn Trp Thr Asp Gly Gly Gly Thr Val Asn Ala Val Asn
35 40 45

Gly Ser Gly Gly Asn Tyr Ser Val Asn Trp Ser Asn Thr Gly Asn Phe
50 55 60

Val Val Gly Lys Gly Trp Thr Thr Gly Ser Pro Phe Arg Thr Ile Asn
65 70 75 80

Tyr Asn Ala Gly Val Trp Ala Pro Asn Gly Asn Ala Tyr Leu Thr Leu
85 90 95

Tyr Gly Trp Thr Arg Ser Pro Leu Ile Glu Tyr Tyr Val Val Asp Ser
100 105 110

Trp Gly Thr Tyr Arg Pro Thr Gly Thr Tyr Lys Gly Thr Val Tyr Ser
115 120 125

Asp Gly Gly Thr Tyr Asp Val Tyr Thr Thr Thr Arg Tyr Asp Ala Pro
130 135 140

Ser Ile Asp Gly Asp Lys Thr Thr Phe Thr Gln Tyr Trp Ser Val Arg
145 150 155 160

Gln Ser Lys Arg Pro Thr Gly Ser Asn Ala Thr Ile Thr Phe Ser Asn
165 170 175

His Val Asn Ala Trp Lys Arg Tyr Gly Met Asn Leu Gly Ser Asn Trp
180 185 190

Ser Tyr Gln Val Leu Ala Thr Glu Gly Tyr Gln Ser Ser Gly Ser Ser
195 200 205

Asn Val Thr Val Trp
210

<210> 231 
<211> 1008 
<212> DNA 
<213> Bacteria 

<400> 231 

atgaacctgc tcgtccagcc gaggcgtcgc agacgcggtc cggtcacctt gctcgtcagg 60 

agcgcgtggg ccgtcgcgct ggcggcgctc gccgcgctga tgctgccggg caccgcccag 120 

gccgacacgg tcgtcacgac caaccaggag ggcaccaaca acggctacta ctactcgttc 180 

tggaccgaca gccagggcac cgtctccatg aacatgggct ccggcggtca gtacagcacc 240 

tcgtggcgca acaccggcaa cttcgtcgcg ggcaagggct gggccaacgg cggccgccgg 300 

accgtgcagt actcgggcag cttcaacccc tccggcaacg cgtacctggc gctctacgga 360 

tggacgtcga acccgctcgt cgagtactac atcgtcgaca actggggcac ctaccggccc 420 

acgggcgagt acaagggcac cgtcaccagc gacggcggca cctacgacat ctacaagacg 480 

acccgcgtca acaagccctc cgtcgagggc acccgcacct tcgaccagta ctggagcgtc 540 

cggcaggcga agcggaccgg cggcaccatc acgaccggca accacttcga cgcgtgggcc 600 

cgggccggga tgccgctcgg caacttcagc tactacatga tcatggccac cgagggctac 660 

cagagcagcg gcagctccag catcaacgtc ggcgggaccg gccgcggcga caacggcggc 720 

ggcgacaacg ggggcggtgg cggcgggtgc accgccacgg tgtccgccgg gcagaagtgg 780 

ggcgaccggt acaacctcga cgtctccgtc agcggcgcca gcgactggac ggtgacgatg 840 

aacgtgccgt ccccggcgaa ggtcctgtcg acctggaacg tcaacgccag ctatcccagt 900 

gcgcagacgc tgaccgccag gtcgaacggc agcggcaaca actggggcgc caccatccag 960 

gccaacggca actggacctg gcccagcgtg tcctgcagcg cgggctga 1008 

<210> 232 

<211> 335 

<212> PRT 

<213> Bacteria 

<220> 

<221> SIGNAL 
<222> (1)...(41)

<400> 232

Met Asn Leu Leu Val Gln Pro Arg Arg Arg Arg Arg Gly Pro Val Thr
1 5 10 15

Leu Leu Val Arg Ser Ala Trp Ala Val Ala Leu Ala Ala Leu Ala Ala
20 25 30

Leu Met Leu Pro Gly Thr Ala Gln Ala Asp Thr Val Val Thr Thr Asn
35 40 45

Gln Glu Gly Thr Asn Asn Gly Tyr Tyr Tyr Ser Phe Trp Thr Asp Ser
50 55 60

Gln Gly Thr Val Ser Met Asn Met Gly Ser Gly Gly Gln Tyr Ser Thr
65 70 75 80

Ser Trp Arg Asn Thr Gly Asn Phe Val Ala Gly Lys Gly Trp Ala Asn
85 90 95

Gly Gly Arg Arg Thr Val Gln Tyr Ser Gly Ser Phe Asn Pro Ser Gly
100 105 110

Asn Ala Tyr Leu Ala Leu Tyr Gly Trp Thr Ser Asn Pro Leu Val Glu
115 120 125

Tyr Tyr Ile Val Asp Asn Trp Gly Thr Tyr Arg Pro Thr Gly Glu Tyr
130 135 140

Lys Gly Thr Val Thr Ser Asp Gly Gly Thr Tyr Asp Ile Tyr Lys Thr
145 150 155 160

Thr Arg Val Asn Lys Pro Ser Val Glu Gly Thr Arg Thr Phe Asp Gln
165 170 175

Tyr Trp Ser Val Arg Gln Ala Lys Arg Thr Gly Gly Thr Ile Thr Thr
180 185 190

Gly Asn His Phe Asp Ala Trp Ala Arg Ala Gly Met Pro Leu Gly Asn
195 200 205

Phe Ser Tyr Tyr Met Ile Met Ala Thr Glu Gly Tyr Gln Ser Ser Gly
210 215 220

Ser Ser Ser Ile Asn Val Gly Gly Thr Gly Arg Gly Asp Asn Gly Gly
225 230 235 240

Gly Asp Asn Gly Gly Gly Gly Gly Gly Cys Thr Ala Thr Val Ser Ala
245 250 255

Gly Gln Lys Trp Gly Asp Arg Tyr Asn Leu Asp Val Ser Val Ser Gly
260 265 270

Ala Ser Asp Trp Thr Val Thr Met Asn Val Pro Ser Pro Ala Lys Val
275 280 285

Leu Ser Thr Trp Asn Val Asn Ala Ser Tyr Pro Ser Ala Gln Thr Leu
290 295 300

Thr Ala Arg Ser Asn Gly Ser Gly Asn Asn Trp Gly Ala Thr Ile Gln
305 310 315 320

Ala Asn Gly Asn Trp Thr Trp Pro Ser Val Ser Cys Ser Ala Gly
325 330 335

<210> 233 
<211> 1071 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample 
<400> 233 

atgtctatgt ttttgagtct caaaagagtg gcggcgctcg tctgcgtcgc agggtttggc 60 

atttcggcgg cgaacgctca gtgcgtcact tcgagccaga caggaaccaa caacgggttc 120 

tatttttcgt tctggaaaga tagtccggga accgtgaatt tctgcaacca gagcggtggc 180 

cgctacacat ccaattggag cggtatcaac aactgggtcg gtggcaaggg ttggcagacc 240 

ggctcgcgaa gggtcgtgag ctactccggt tcgttcaatt cgccgggcaa cgggtatctg 300 

accctctatg ggtggaccac caatccgctc atcgagtact acatcgtcga caactggggc 360 

tcgtatcgcc cgccgggcgg acaggggttc atgggcacgg tgaccagcga cggcggcacg 420 

tacgatgtct accgcacaca gcgcgtcaat caaccctgca tcaccggcag cagttgcacc 480 

ttctatcaat actggagcgt gcggcagtcg aagagaacgg gcggcacgat cacgacgggc 540 

aatcactttg acgcgtgggc gagttacggc atgaacctgg gcgctcacaa ctaccagatc 600 

atggcgaccg agggttatca aagcagcggg agctctgaca tcacggtcag tgaaggcagc 660 

agcagtagca gcagtagcag cagttcgagc agtagctcga gcagcagctc cagcagcagc 720 

agcggcggcg gtggcaccaa gagcttcacg gtccgcgcgc gcggcgtggc cggcggggaa 780 

tccatcacgt tgcgcgtgaa caatcagaac gtgcagacct ggactctcgg caccggcatg 840 

cagaactaca cggcgtcgac gtctttgagt ggcggcatca cggttgcgta taccaacgat 900 

ggcggcagtc gcgacgtgca ggttgactac atcatcgtga acggccagac gcgtcagtcg 960 

gaagcgcaga gctacaacac cgggctttat gccaacggcc gttgcggtgg cggcggcaac 1020 

agcgaatgga tgcattgcaa tggcgccatt ggctacggga acacgccgta g 1071 

<210> 234 
<211> 356 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(26)
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<400> 234 

Met Ser Met Phe Leu Ser Leu Lys Arg Val Ala Ala Leu Val Cys Val

1 5 10 15

Ala Gly Phe Gly Ile Ser Ala Ala Asn Ala Gln Cys Val Thr Ser Ser

20 25 30

Gln Thr Gly Thr Asn Asn Gly Phe Tyr Phe Ser Phe Trp Lys Asp Ser

35 40 45

Pro Gly Thr Val Asn Phe Cys Asn Gln Ser Gly Gly Arg Tyr Thr Ser

50 55 60

Asn Trp Ser Gly Ile Asn Asn Trp Val Gly Gly Lys Gly Trp Gln Thr
65 70 75 80

Gly Ser Arg Arg Val Val Ser Tyr Ser Gly Ser Phe Asn Ser Pro Gly

85 90 95

Asn Gly Tyr Leu Thr Leu Tyr Gly Trp Thr Thr Asn Pro Leu Ile Glu

100 105 110

Tyr Tyr Ile Val Asp Asn Trp Gly Ser Tyr Arg Pro Pro Gly Gly Gln

115 120 125

Gly Phe Met Gly Thr Val Thr Ser Asp Gly Gly Thr Tyr Asp Val Tyr

130 135 140

Arg Thr Gln Arg Val Asn Gln Pro Cys Ile Thr Gly Ser Ser Cys Thr
145 150 155 160

Phe Tyr Gln Tyr Trp Ser Val Arg Gln Ser Lys Arg Thr Gly Gly Thr

165 170 175

Ile Thr Thr Gly Asn His Phe Asp Ala Trp Ala Ser Tyr Gly Met Asn

180 185 190

Leu Gly Ala His Asn Tyr Gln Ile Met Ala Thr Glu Gly Tyr Gln Ser

195 200 205

Ser Gly Ser Ser Asp Ile Thr Val Ser Glu Gly Ser Ser Ser Ser Ser

210 215 220

Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser
225 230 235 240

Ser Gly Gly Gly Gly Thr Lys Ser Phe Thr Val Arg Ala Arg Gly Val

245 250 255

Ala Gly Gly Glu Ser Ile Thr Leu Arg Val Asn Asn Gln Asn Val Gln

260 265 270

Thr Trp Thr Leu Gly Thr Gly Met Gln Asn Tyr Thr Ala Ser Thr Ser

275 280 285

Leu Ser Gly Gly Ile Thr Val Ala Tyr Thr Asn Asp Gly Gly Ser Arg

290 295 300

Asp Val Gln Val Asp Tyr Ile Ile Val Asn Gly Gln Thr Arg Gln Ser
305 310 315 320

Glu Ala Gln Ser Tyr Asn Thr Gly Leu Tyr Ala Asn Gly Arg Cys Gly

325 330 335

Gly Gly Gly Asn Ser Glu Trp Met His Cys Asn Gly Ala Ile Gly Tyr

340 345 350

Gly Asn Thr Pro
355

<210> 235 
<211> 1539 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 235 

atgtcgaata acagatttgt gctgaatcgt gttgctgcag gtttgctgct gggtttctcg 60 

ctgctgtcat cagcagccat cgcccagaat gtggtggtaa atccttctac ggtccatcag 120 

accgtgcgcg gctttggcgg catgaacgcg ccgggctgga ttgatgacct taccaccgcc 180 

caggtcaata aggcctatgg cagtggcgat ggccaggtcg ggctctccat catgcgcatg 240 

cgcattgatc cgaactcggc agcctggaat atccaggtgc cggctgccaa gcgggccaag 300 

gagctgggtg cgatcctgtt tgccacgccc tggtcgccgc ccgcctacat gaaatccaac 360 

aaaagcctga ataacggcgg caagctgctg cccgagtatt acagcgccta caccacccac 420 

ctgctggatt ttgcgagttt catgtcgcgc aacggcgcac cgctgtatgc gatttcaatc 480 

cagaacgaac cggactggct gccggattat gagtcgtgtg cctggactgg tactgatttc 540 

gtcaattatc tgaataccca gggctcgcgt tttggtgatc tgaaagtgat tgcgccggaa 600 

tccctgggtt tcacgacctc gtattccgac cccatcctca acagcgccac ggcagcgccg 660 

catgtcgaca tcatcggcgg ccacctctac ggcgtgctgc ccaaggacta cccgctggcg 720 
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cgccagaagg gcaaggaaat ctggatgacc gagcattaca ccgagagcaa gaactcgggt 780 

gatgcctggc cgctggcgct ggacgtaggc accgagctgc accagagcat ggtggccaac 840 

tacaacgcct acgtgtggtg gtatgtgcgc cgcagctacg gcctgctgct ggagaacggc 900 

aatgtgagca agcgcggcta catcatgtcg cagtacgcac gcttcgtccg ccccggctcc 960 

aagcgcatcg gcgcgacgga aaagccgcac gccgacgtgg cggtgacggc ctacaagacg 1020 

ccggataacc gcattgtgct ggtggcggtg aataccggtg cggcgcaccg tcagctgaac 1080 

atcacggtgc cgagcggcag cgtgggttct ttcagcaagt tctccacttc cggcacgctg 1140 

aatgtgggca gtggtggcag ctacaaggtc aacaacggcg cggtgagcct gtacatcgat 1200 

ccgcagagcg tggccacgct ggtgggtgat ctgcccggca cggcctccag ctcttcggcg 1260 

gcgtcctcgt cctcttccag tgcagccagc tctgcttcga gcagtgctag cggcgcaccg 1320 

gccctgtctg gcagcagcga ttaccccacg ggcttcagca agtgcgctga tctgggtggt 1380 

acttgtgccg tgccttcggg ctcgggctgg acggctttcg ggcgcaaggg caagtgggtt 1440 

gccaagtacg tcggtgtggg caagagcatt gcctgcacgg tgacggcttt cggcagcgat 1500 

cccggtggtg cacccaacaa gtgttcttac cagaagtaa 1539

<210> 236 
<211> 512 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(28)

<400> 236 

Met Ser Asn Asn Arg Phe Val Leu Asn Arg Val Ala Ala Gly Leu Leu
1 5 10 15

Leu Gly Phe Ser Leu Leu Ser Ser Ala Ala Ile Ala Gln Asn Val Val

20 25 30

Val Asn Pro Ser Thr Val His Gln Thr Val Arg Gly Phe Gly Gly Met

35 40 45

Asn Ala Pro Gly Trp Ile Asp Asp Leu Thr Thr Ala Gln Val Asn Lys

50 55 60

Ala Tyr Gly Ser Gly Asp Gly Gln Val Gly Leu Ser Ile Met Arg Met
65 70 75 80

Arg Ile Asp Pro Asn Ser Ala Ala Trp Asn Ile Gln Val Pro Ala Ala

85 90 95

Lys Arg Ala Lys Glu Leu Gly Ala Ile Leu Phe Ala Thr Pro Trp Ser

100 105 110

Pro Pro Ala Tyr Met Lys Ser Asn Lys Ser Leu Asn Asn Gly Gly Lys

115 120 125

Leu Leu Pro Glu Tyr Tyr Ser Ala Tyr Thr Thr His Leu Leu Asp Phe

130 135 140

Ala Ser Phe Met Ser Arg Asn Gly Ala Pro Leu Tyr Ala Ile Ser Ile
145 150 155 160

Gln Asn Glu Pro Asp Trp Leu Pro Asp Tyr Glu Ser Cys Ala Trp Thr

165 170 175

Gly Thr Asp Phe Val Asn Tyr Leu Asn Thr Gln Gly Ser Arg Phe Gly

180 185 190

Asp Leu Lys Val Ile Ala Pro Glu Ser Leu Gly Phe Thr Thr Ser Tyr

195 200 205

Ser Asp Pro Ile Leu Asn Ser Ala Thr Ala Ala Pro His Val Asp Ile

210 215 220

Ile Gly Gly His Leu Tyr Gly Val Leu Pro Lys Asp Tyr Pro Leu Ala
225 230 235 240

Arg Gln Lys Gly Lys Glu Ile Trp Met Thr Glu His Tyr Thr Glu Ser

245 250 255

Lys Asn Ser Gly Asp Ala Trp Pro Leu Ala Leu Asp Val Gly Thr Glu

260 265 270

Leu His Gln Ser Met Val Ala Asn Tyr Asn Ala Tyr Val Trp Trp Tyr

275 280 285

Val Arg Arg Ser Tyr Gly Leu Leu Leu Glu Asn Gly Asn Val Ser Lys

290 295 300

Arg Gly Tyr Ile Met Ser Gln Tyr Ala Arg Phe Val Arg Pro Gly Ser
305 310 315 320

Lys Arg Ile Gly Ala Thr Glu Lys Pro His Ala Asp Val Ala Val Thr
325 330 335

Ala Tyr Lys Thr Pro Asp Asn Arg Ile Val Leu Val Ala Val Asn Thr

340 345 350

Gly Ala Ala His Arg Gln Leu Asn Ile Thr Val Pro Ser Gly Ser Val

355 360 365

Gly Ser Phe Ser Lys Phe Ser Thr Ser Gly Thr Leu Asn Val Gly Ser

370 375 380

Gly Gly Ser Tyr Lys Val Asn Asn Gly Ala Val Ser Leu Tyr Ile Asp
385 390 395 400

Pro Gln Ser Val Ala Thr Leu Val Gly Asp Leu Pro Gly Thr Ala Ser

405 410 415

Ser Ser Ser Ala Ala Ser Ser Ser Ser Ser Ser Ala Ala Ser Ser Ala

420 425 430

Ser Ser Ser Ala Ser Gly Ala Pro Ala Leu Ser Gly Ser Ser Asp Tyr

435 440 445

Pro Thr Gly Phe Ser Lys Cys Ala Asp Leu Gly Gly Thr Cys Ala Val

450 455 460

Pro Ser Gly Ser Gly Trp Thr Ala Phe Gly Arg Lys Gly Lys Trp Val
465 470 475 480

Ala Lys Tyr Val Gly Val Gly Lys Ser Ile Ala Cys Thr Val Thr Ala

485 490 495

Phe Gly Ser Asp Pro Gly Gly Ala Pro Asn Lys Cys Ser Tyr Gln Lys

500 505 510

<210> 237
<211> 1269
<212> DNA
<213> Unknown

<220>

<223> Obtained from an environmental sample
<400> 237

atgattccac gcataaaaaa aacaatttgt gtactattag tatgtttcac tatgctgtca 60

gtcatgttag ggccaggcgc tactgaagtt ttggcagcaa gtgatgtaac agttaatgta 120

tctgcagaga aacaagtgat tcgcggtttt ggagggatga atcatccggc ttgggctggg 180

gatcttacag cagctcaaag agaaactgct tttggcaatg gacagaacca gttaggattt 240

tcaatcttaa gaattcatgt agatgaaaat cgaaataatt ggtataaaga ggtggagact 300

gcaaagagtg cggtcaaaca cggagcaatc gtttttgctt ctccttggaa tcctccaagt 360

gatatggttg agacctttaa tcggaatggt gacacatcgg ctaaacggct gaaatacaac 420

aagtacgcag catacgcgca gcatcttaac gattttgtta ccttcatgaa gaataatggt 480

gtgaatcttt acgcgatttc ggtccaaaac gagcctgatt acgctcacga gtggacgtgg 540

tggacgccgc aagaaatact tcgctttatg agagaaaacg ccggctcgat caatgcccgc 600

gtcattgcgc ctgagtcatt tcaatacttg aagaatttgt cggacccgat cttgaacgat 660

ccgcaggctc ttgccaatat ggatattctc ggaactcacc tgtacggcac ccaggtcagc 720

caattccctt atcctctttt caaacaaaaa ggagcgggga aggacctttg gatgacggaa 780

gtatactatc caaacagtga taccaactcg gcggatcgat ggcctgaggc attggatgtt 840

tcacagcata ttcacaatgc gatggtagag ggggactttc aagcttatgt atggtggtac 900

atccgaagat catatggacc tatgaaagaa gatggtacga tcagcaaacg cggctacaat 960

atggctcatt tctcaaagtt tgtgcgtccc ggctatgtaa ggattgatgc aacgaaaaac 1020

cctaatgcga acgtttacgt gtcagcctat aaaggtgaca acaaggtcgt tattgttgcc 1080

atcaataaaa gcaacacagg agtcaaccaa aactttgttt tgcagaatgg atctgcttca 1140

aacgtatcta gatggatcac gagcagcagc agcaatctac aacctggaac gaatctcact 1200

gtatcaggca atcatttttg ggctcatctt ccagctcaaa gcgtgacaac atttgttgta 1260

aatcgttaa 1269

<210> 238
<211> 422
<212> PRT
<213> Unknown

<220>

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(32)

<400> 238

Met Ile Pro Arg Ile Lys Lys Thr Ile Cys Val Leu Leu Val Cys Phe
1 5 10 15

Thr Met Leu Ser Val Met Leu Gly Pro Gly Ala Thr Glu Val Leu Ala

20 25 30 

Ala Ser Asp Val Thr Val Asn Val Ser Ala Glu Lys Gln Val Ile Arg

35 40 45

Gly Phe Gly Gly Met Asn His Pro Ala Trp Ala Gly Asp Leu Thr Ala

50 55 60

Ala Gln Arg Glu Thr Ala Phe Gly Asn Gly Gln Asn Gln Leu Gly Phe
65 70 75 80

Ser Ile Leu Arg Ile His Val Asp Glu Asn Arg Asn Asn Trp Tyr Lys

85 90 95

Glu Val Glu Thr Ala Lys Ser Ala Val Lys His Gly Ala Ile Val Phe

100 105 110

Ala Ser Pro Trp Asn Pro Pro Ser Asp Met Val Glu Thr Phe Asn Arg

115 120 125

Asn Gly Asp Thr Ser Ala Lys Arg Leu Lys Tyr Asn Lys Tyr Ala Ala

130 135 140

Tyr Ala Gln His Leu Asn Asp Phe Val Thr Phe Met Lys Asn Asn Gly
145 150 155 160

Val Asn Leu Tyr Ala Ile Ser Val Gln Asn Glu Pro Asp Tyr Ala His

165 170 175

Glu Trp Thr Trp Trp Thr Pro Gln Glu Ile Leu Arg Phe Met Arg Glu

180 185 190

Asn Ala Gly Ser Ile Asn Ala Arg Val Ile Ala Pro Glu Ser Phe Gln

195 200 205

Tyr Leu Lys Asn Leu Ser Asp Pro Ile Leu Asn Asp Pro Gln Ala Leu

210 215 220

Ala Asn Met Asp Ile Leu Gly Thr His Leu Tyr Gly Thr Gln Val Ser
225 230 235 240

Gln Phe Pro Tyr Pro Leu Phe Lys Gln Lys Gly Ala Gly Lys Asp Leu

245 250 255

Trp Met Thr Glu Val Tyr Tyr Pro Asn Ser Asp Thr Asn Ser Ala Asp

260 265 270

Arg Trp Pro Glu Ala Leu Asp Val Ser Gln His Ile His Asn Ala Met

275 280 285

Val Glu Gly Asp Phe Gln Ala Tyr Val Trp Trp Tyr Ile Arg Arg Ser

290 295 300

Tyr Gly Pro Met Lys Glu Asp Gly Thr Ile Ser Lys Arg Gly Tyr Asn
305 310 315 320

Met Ala His Phe Ser Lys Phe Val Arg Pro Gly Tyr Val Arg Ile Asp

325 330 335

Ala Thr Lys Asn Pro Asn Ala Asn Val Tyr Val Ser Ala Tyr Lys Gly

340 345 350

Asp Asn Lys Val Val Ile Val Ala Ile Asn Lys Ser Asn Thr Gly Val

355 360 365

Asn Gln Asn Phe Val Leu Gln Asn Gly Ser Ala Ser Asn Val Ser Arg

370 375 380

Trp Ile Thr Ser Ser Ser Ser Asn Leu Gln Pro Gly Thr Asn Leu Thr
385 390 395 400

Val Ser Gly Asn His Phe Trp Ala His Leu Pro Ala Gln Ser Val Thr

405 410 415

Thr Phe Val Val Asn Arg
420

<210> 239 
<211> 1281 
<212> DNA 
<213> Unknown 

<220> 

<223> obtained from an environmental sample 
<400> 239 

atgaatcgtt tcttgatttc acgttataag aaagccataa gtgcatgttt ggcccttgtc 60 
cttgcgttgt ctctcatggc ggcacctggc gatgttgccg cagccagcga cgccgttata 120 
aatgtatcgt cggagaaaca agtgatacgc ggtttcggag gcatcaacca cccggcatgg 180 
atcggagatt tgacggcagc acagagagaa accgcatttg ggaacgggcc aaatcagtta 240 
ggcttctcga tattaagaat ctacgtgcat gaagaccgaa atcagtggca ccgtgaactg 300
gatacggcca aacgagcgat tgcccttgga gctatcgtat tcgcttcgcc atggaatccg 360 
cccgccgaca tggtcgagac cttcaaccgc aacggcgata cgtcggcaaa gcgacttcgt 420 
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tacgacaagt ataccgccta tgcccagcat cttaacgatt tcgtaaccta catgagaaac 480 

aatggcgtga atctctacgc gatttccgtc cagaacgagc ccgattatgc gcatgactgg 540 

acgtggtgga ctccgcagga aatgcttcgc tttatgaaag aaaatgccgg atcgatcaac 600 

agcagagtga tcgcaccgga atcgttccaa tatctgaaaa atatgtcgga cccgattcta 660 

aatgatcccc aggcgcttgc caatatggat attcttggcg ctcatctgta cggtacccaa 720 

gttagcaatt tcgcgtatcc actattcaaa caaaaaggag cgggaaaaga cctctggatg 780 

accgaggtgt attacccgaa cagcgacaac aactcggcgg atcgctggcc cgaagccctg 840 

gatgtgtctt accatatcca caatgcgatg gtagagggag attttcaagc ttatgtatgg 900 

tggtatatcc gcagatccta tggtccaatg aaagaggacg gcacgatcag caaacgcggc 960 

tacaatatgg ctcatttctc caagtttgtc cgtcccggct atgtcagggt ggatgcttcg 1020 

aaaaatccag aaacgaacgt ttacgtatcc gcatataaag gcgacaacaa aatcgttatc 1080 

gttgccataa accggaacaa ctccggggtc aatcagaact ttgtccttca gaatggatcc 1140 

gtttcgcagg tatcaaggtg gatcacgagc agcagcagca atctccagcc aggaacgtct 1200 

ctcaatgtaa cagggagcaa tttctgggct catcttcccg cgcaaagcgt tacgactttt 1260 

gtgggtgaac tcggaaggta a 1281 

<210> 240 
<211> 426 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL
<222> (1)...(30)

<400> 240 

Met Asn Arg Phe Leu Ile Ser Arg Tyr Lys Lys Ala Ile Ser Ala Cys
1 5 10 15

Leu Ala Leu Val Leu Ala Leu Ser Leu Met Ala Ala Pro Gly Asp Val

20 25 30

Ala Ala Ala Ser Asp Ala Val Ile Asn Val Ser Ser Glu Lys Gln Val

35 40 45

Ile Arg Gly Phe Gly Gly Ile Asn His Pro Ala Trp Ile Gly Asp Leu

50 55 60

Thr Ala Ala Gln Arg Glu Thr Ala Phe Gly Asn Gly Pro Asn Gln Leu
65 70 75 80

Gly Phe Ser Ile Leu Arg Ile Tyr Val His Glu Asp Arg Asn Gln Trp

85 90 95

His Arg Glu Leu Asp Thr Ala Lys Arg Ala Ile Ala Leu Gly Ala Ile

100 105 110

Val Phe Ala Ser Pro Trp Asn Pro Pro Ala Asp Met Val Glu Thr Phe

115 120 125

Asn Arg Asn Gly Asp Thr Ser Ala Lys Arg Leu Arg Tyr Asp Lys Tyr

130 135 140

Thr Ala Tyr Ala Gln His Leu Asn Asp Phe Val Thr Tyr Met Arg Asn
145 150 155 160

Asn Gly Val Asn Leu Tyr Ala Ile Ser Val Gln Asn Glu Pro Asp Tyr

165 170 175

Ala His Asp Trp Thr Trp Trp Thr Pro Gln Glu Met Leu Arg Phe Met

180 185 190

Lys Glu Asn Ala Gly Ser Ile Asn Ser Arg Val Ile Ala Pro Glu Ser

195 200 205

Phe Gln Tyr Leu Lys Asn Met Ser Asp Pro Ile Leu Asn Asp Pro Gln
210 215 220

Ala Leu Ala Asn Met Asp Ile Leu Gly Ala His Leu Tyr Gly Thr Gln
225 230 235 240

Val Ser Asn Phe Ala Tyr Pro Leu Phe Lys Gln Lys Gly Ala Gly Lys

245 250 255

Asp Leu Trp Met Thr Glu Val Tyr Tyr Pro Asn Ser Asp Asn Asn Ser

260 265 270

Ala Asp Arg Trp Pro Glu Ala Leu Asp Val Ser Tyr His Ile His Asn

275 280 285

Ala Met Val Glu Gly Asp Phe Gln Ala Tyr Val Trp Trp Tyr Ile Arg

290 295 300

Arg Ser Tyr Gly Pro Met Lys Glu Asp Gly Thr Ile Ser Lys Arg Gly
305 310 315 320

Tyr Asn Met Ala His Phe Ser Lys Phe Val Arg Pro Gly Tyr Val Arg
325 330 335

Val Asp Ala Ser Lys Asn Pro Glu Thr Asn Val Tyr Val Ser Ala Tyr

340 345 350

Lys Gly Asp Asn Lys Ile Val Ile Val Ala Ile Asn Arg Asn Asn Ser

355 360 365

Gly Val Asn Gln Asn Phe Val Leu Gln Asn Gly Ser Val Ser Gln Val

370 375 380

Ser Arg Trp Ile Thr Ser Ser Ser Ser Asn Leu Gln Pro Gly Thr Ser
385 390 395 400

Leu Asn Val Thr Gly Ser Asn Phe Trp Ala His Leu Pro Ala Gln Ser

405 410 415

Val Thr Thr Phe Val Gly Glu Leu Gly Arg
420 425



<210> 241 
<211> 1695 
<212> DNA 
<213> Unknown 

<220>

<223> Obtained from an environmental sample 



<400> 241 

gtgaagatat tgaaatttaa gatgaattta aaaaaatcgg ttcatgttct gttggcctgt 60

ttaacagccc tgcctctcat gttaacgccg acacacgtat cagcagcaag tgatgccaac 120

attaatttgt cctccgaaaa acagcttatc aaggggtttg gaggtattaa ccacccagcc 180

tggattggcg acttgacggc agctcagcgt gaaacagcct ttggcaacgg agcgaaccag 240

cttggttttt ccatactaag aatctatgtc gatgaaaatc caaacaactg gtacagggag 300

gtggctactg ccaaaagagc catagagcaa ggtgccatcg tattcgcttc tccctggaat 360

ccaccaagtg acatggtcga aaccttcaat cggaacgggg atacgaacgc caaacgattg 420

agatacgaca aatatgctgc gtacgcgcag catctgaacg actttgtcag ttatatgaaa 480

aataacggtg tggatctgta tgccatttcg gtacaaaatg agccggatta tgcccatgaa 540

tggacctggt ggactccgca ggagatcctt cgtttcatga aggagaatgc gggatccatt 600

cagaatacca aagtcatggc acctgaatcg ttccagtatt tgaaaaacat gtctgacccg 660

attctgaatg atcctcaggc actcgccaat atggacattc tgggagctca tacgtacggg 720

acacagttca aagatttcgc atacccgctc tttaagcaaa agggagccgg caaagaactg 780

tggatgacag aagtgtatta cccgaacagc gataacaact cgtcggaccg ttggcctgag 840

gcattggacg tatcttacca tatgcataat gccatggttg aaggagattt tcaggcttac 900

gtatggtggt atattcggag acagtacggt ccgatgaatg agaacgggac tattagcaaa 960

cgtggttaca atatggccca tttctccaaa tttgtgcgac caggctatta ccgtgtcgat 1020

gcaaccaaaa atccggatac caataccttc gtctcagcct ataaaggtga taataaggca 1080

gtcattgtgg cgattaatcg cggcacctcg gctgtaagcc aaaaattcgt tcttcagaat 1140

ggtaacgctt ctactgtatc ctcttgggtt acggatagca gccgaaacct ggcaagcgga 1200

gcgccgatta cgatgtcagg tggagccttt acagcacaac tgccagccca aagcgtaaca 1260

acgtttgtag ccaacattac tggtggtagt gtcactccag gcagcggaac cacgtacgag 1320

gcggaaacgg gcactacact taccgatgcc gtgatcgaga ctctctaccc gggatacact 1380

gggaccggat acgtgaactt taatgcgtat actggttcgg ccattcaatg gaatgccatc 1440

aataacacga taacaggtac caaaaatgtg aaatttcgtt acgcccagga aagcggaacg 1500

cgtaatctcg acattttcgt taacggaact aaagtcatca gcaacgaacc tttcccggca 1560

acaggcagct ggtcgacctg gagtgaaaaa actattcagg tccccatgaa cgcgggaacc 1620

aatacgatta aagtggtcac aaccggtaca gaagggccaa atattgataa catcaatgtc 1680

actgcagtcc aataa 1695



<210> 242 
<211> 564 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(28) 

<400> 242

Val Lys Ile Leu Lys Phe Lys Met Asn Leu Lys Lys Ser Val His Val

1 5 10 15

Leu Leu Ala Cys Leu Thr Ala Leu Pro Leu Met Leu Thr Pro Thr His

20 25 30

Val Ser Ala Ala Ser Asp Ala Asn Ile Asn Leu Ser Ser Glu Lys Gln

35 40 45

Leu Ile Lys Gly Phe Gly Gly Ile Asn His Pro Ala Trp Ile Gly Asp

50 55 60

Leu Thr Ala Ala Gln Arg Glu Thr Ala Phe Gly Asn Gly Ala Asn Gln
65 70 75 80

Leu Gly Phe Ser Ile Leu Arg Ile Tyr Val Asp Glu Asn Pro Asn Asn

85 90 95

Trp Tyr Arg Glu Val Ala Thr Ala Lys Arg Ala Ile Glu Gln Gly Ala

100 105 110

Ile Val Phe Ala Ser Pro Trp Asn Pro Pro Ser Asp Met Val Glu Thr

115 120 125

Phe Asn Arg Asn Gly Asp Thr Asn Ala Lys Arg Leu Arg Tyr Asp Lys

130 135 140

Tyr Ala Ala Tyr Ala Gln His Leu Asn Asp Phe Val Ser Tyr Met Lys
145 150 155 160

Asn Asn Gly Val Asp Leu Tyr Ala Ile Ser Val Gln Asn Glu Pro Asp

165 170 175

Tyr Ala His Glu Trp Thr Trp Trp Thr Pro Gln Glu Ile Leu Arg Phe

180 185 190

Met Lys Glu Asn Ala Gly Ser Ile Gln Asn Thr Lys Val Met Ala Pro

195 200 205

Glu Ser Phe Gln Tyr Leu Lys Asn Met Ser Asp Pro Ile Leu Asn Asp

210 215 220

Pro Gln Ala Leu Ala Asn Met Asp Ile Leu Gly Ala His Thr Tyr Gly
225 230 235 240

Thr Gln Phe Lys Asp Phe Ala Tyr Pro Leu Phe Lys Gln Lys Gly Ala

245 250 255

Gly Lys Glu Leu Trp Met Thr Glu Val Tyr Tyr Pro Asn Ser Asp Asn

260 265 270

Asn Ser Ser Asp Arg Trp Pro Glu Ala Leu Asp Val Ser Tyr His Met

275 280 285

His Asn Ala Met Val Glu Gly Asp Phe Gln Ala Tyr Val Trp Trp Tyr

290 295 300

Ile Arg Arg Gln Tyr Gly Pro Met Asn Glu Asn Gly Thr Ile Ser Lys
305 310 315 320

Arg Gly Tyr Asn Met Ala His Phe Ser Lys Phe Val Arg Pro Gly Tyr

325 330 335

Tyr Arg Val Asp Ala Thr Lys Asn Pro Asp Thr Asn Thr Phe Val Ser

340 345 350

Ala Tyr Lys Gly Asp Asn Lys Ala Val Ile Val Ala Ile Asn Arg Gly

355 360 365

Thr Ser Ala Val Ser Gln Lys Phe Val Leu Gln Asn Gly Asn Ala Ser

370 375 380

Thr Val Ser Ser Trp Val Thr Asp Ser Ser Arg Asn Leu Ala Ser Gly
385 390 395 400

Ala Pro Ile Thr Met Ser Gly Gly Ala Phe Thr Ala Gln Leu Pro Ala

405 410 415

Gln Ser Val Thr Thr Phe Val Ala Asn Ile Thr Gly Gly Ser Val Thr

420 425 430

Pro Gly Ser Gly Thr Thr Tyr Glu Ala Glu Thr Gly Thr Thr Leu Thr

435 440 445

Asp Ala Val Ile Glu Thr Leu Tyr Pro Gly Tyr Thr Gly Thr Gly Tyr

450 455 460

Val Asn Phe Asn Ala Tyr Thr Gly Ser Ala Ile Gln Trp Asn Ala Ile
465 470 475 480

Asn Asn Thr Ile Thr Gly Thr Lys Asn Val Lys Phe Arg Tyr Ala Gln

485 490 495

Glu Ser Gly Thr Arg Asn Leu Asp Ile Phe Val Asn Gly Thr Lys Val

500 505 510

Ile Ser Asn Glu Pro Phe Pro Ala Thr Gly Ser Trp Ser Thr Trp Ser

515 520 525

Glu Lys Thr Ile Gln Val Pro Met Asn Ala Gly Thr Asn Thr Ile Lys

530 535 540

Val Val Thr Thr Gly Thr Glu Gly Pro Asn Ile Asp Asn Ile Asn Val
545 550 555 560

Thr Ala Val Gin 
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<210> 243 
<211> 1272 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 



<400> 243 

atgatttcaa gcgtaaaaaa accaatttgt gtattattgg tatgtttcac tatgctgtca 60

gtcatgttag ccgggccagg tgctactgaa gttttagcag caagtgatgt aacaattaat 120

ttatctgcag aaaaacaagt gatccgcggt tttggaggca tgaaccaccc ggcttggatt 180

ggagatttga cagcagctca aagagaaacc gcttttggca atggacagaa tcagttaggt 240

ttttcaatct taagaattca tgtggatgaa aatagaaata attggtacag agaagtggag 300

actgcaaaga gtgcgatcaa acatggagca atcgtttttg cttctccctg gaatcctcca 360

agcgatatgg ttgagacttt caatcgtaat ggtgacacat cagctaaacg gctaagatac 420

gataagtacg ccgcatacgc gcagcatctt aacgattttg ttacctacat gaagaataat 480

ggcgtgaatc tttatgcgat ttctgttcaa aacgagcctg attatgcgca cgaatggacg 540

tggtggactc cgcaagaaat acttcgtttc atgagagaaa atgccggttc cattaatgca 600

cgtgtcattg caccagaatc ttttcagtac tttaaaaata tatcggaccc cattttgaac 660

gatccacagg cgcttaggaa tatggatatt ctcggaactc acctgtacgg tactcaggtc 720

agtcagtttc cttatcctct attcaaacaa aaaggagcag ggaaagagct atggatgacg 780

gaagtatact atccaaacag tgacaacaat tcagcggatc gctggcccga ggcattaggc 840

gtttcagagc atattcacca ttcaatggtg gagggagatt ttcaatctta tgtttggtgg 900

tacatccgca gatcttacgg tcctatgaaa gaggacggta cgatcagcaa acgcggttac 960

aatatggctc atttctcgaa gtttgtgcgt cccggctatg taagggtaga tgcaacgaaa 1020

aatcctaatg cgaacgttta cgtgtcagcc tataaaggtg acaacaaggt cgttattgtt 1080

gccattaaca aaagcaatac aggggtcaac caaaactttg tgttgcagaa tggatctgct 1140

tctcaggtat ctaggtggat aacaagcgga agcagcaatc ttcaacctgg aacgaatctc 1200

aatgtaacgg gcaatcattt ttgggcccat cttccagctc aaagcgtgac aacatttgtc 1260

gcaaatcgtt aa 1272



<210> 244 
<211> 423 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(33)

<400> 244

Met Ile Ser Ser Val Lys Lys Pro Ile Cys Val Leu Leu Val Cys Phe
1 5 10 15

Thr Met Leu Ser Val Met Leu Ala Gly Pro Gly Ala Thr Glu Val Leu

20 25 30

Ala Ala Ser Asp Val Thr Ile Asn Leu Ser Ala Glu Lys Gln Val Ile

35 40 45

Arg Gly Phe Gly Gly Met Asn His Pro Ala Trp Ile Gly Asp Leu Thr

50 55 60

Ala Ala Gln Arg Glu Thr Ala Phe Gly Asn Gly Gln Asn Gln Leu Gly
65 70 75 80

Phe Ser Ile Leu Arg Ile His Val Asp Glu Asn Arg Asn Asn Trp Tyr

85 90 95

Arg Glu Val Glu Thr Ala Lys Ser Ala Ile Lys His Gly Ala Ile Val

100 105 110

Phe Ala Ser Pro Trp Asn Pro Pro Ser Asp Met Val Glu Thr Phe Asn

115 120 125

Arg Asn Gly Asp Thr Ser Ala Lys Arg Leu Arg Tyr Asp Lys Tyr Ala

130 135 140

Ala Tyr Ala Gln His Leu Asn Asp Phe Val Thr Tyr Met Lys Asn Asn
145 150 155 160

Gly Val Asn Leu Tyr Ala Ile Ser Val Gln Asn Glu Pro Asp Tyr Ala

165 170 175

His Glu Trp Thr Trp Trp Thr Pro Gln Glu Ile Leu Arg Phe Met Arg

180 185 190

Glu Asn Ala Gly Ser Ile Asn Ala Arg Val Ile Ala Pro Glu Ser Phe

195 200 205

Gln Tyr Phe Lys Asn Ile Ser Asp Pro Ile Leu Asn Asp Pro Gln Ala

210 215 220

Leu Arg Asn Met Asp Ile Leu Gly Thr His Leu Tyr Gly Thr Gln Val
225 230 235 240

Ser Gln Phe Pro Tyr Pro Leu Phe Lys Gln Lys Gly Ala Gly Lys Glu

245 250 255

Leu Trp Met Thr Glu Val Tyr Tyr Pro Asn Ser Asp Asn Asn Ser Ala

260 265 270

Asp Arg Trp Pro Glu Ala Leu Gly Val Ser Glu His Ile His His Ser

275 280 285

Met Val Glu Gly Asp Phe Gln Ser Tyr Val Trp Trp Tyr Ile Arg Arg

290 295 300

Ser Tyr Gly Pro Met Lys Glu Asp Gly Thr Ile Ser Lys Arg Gly Tyr
305 310 315 320

Asn Met Ala His Phe Ser Lys Phe Val Arg Pro Gly Tyr Val Arg Val

325 330 335

Asp Ala Thr Lys Asn Pro Asn Ala Asn Val Tyr Val Ser Ala Tyr Lys

340 345 350

Gly Asp Asn Lys Val Val Ile Val Ala Ile Asn Lys Ser Asn Thr Gly

355 360 365

Val Asn Gln Asn Phe Val Leu Gln Asn Gly Ser Ala Ser Gln Val Ser

370 375 380

Arg Trp Ile Thr Ser Gly Ser Ser Asn Leu Gln Pro Gly Thr Asn Leu
385 390 395 400

Asn Val Thr Gly Asn His Phe Trp Ala His Leu Pro Ala Gln Ser Val

405 410 415

Thr Thr Phe Val Ala Asn Arg
420





<210> 245 
<211> 1263 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 245 

atgtcaatga tcaaaaaacc aatctgcact ttattgatct gcttcaccat gctgtctgtc 60

atgttcatcg gacctggcgt gactgaggtt tcagcagcag atgccaatat taatatcaat 120

gcggaaagac aagtgattcg cggctttggc ggaatgaacc atccggcttg gattggtgat 180

ttgaccgcac ctcaaaggga aaccgccttt ggcaatgggc agaatcaatt aggattttcc 240

attctaagaa tttttgtaga tgagaaccga aataattggc acagagaggt cgctactgcc 300

aaaagagcaa tagagcatgg agctttggtg atcgcttcac catggaatcc tccaagcaat 360

atggtagaga ccttcaaccg gaatggtaca tctgcaaagc ggctcagata caaccaatac 420

gccgcatatg ctcagcatct gaacgatttt gtgacgtata tgaaaaataa tggcgtcaat 480

ctctatgcta tatctgtaca aaatgagccc gattatgcac acgaatggac atggtggact 540

cctcaggaaa tcctgcgttt catgagagaa aatgctggct ccattaatgc ccgcgtgatc 600

gcaccagaat cctttcaata ccttaaaaat atatcagatc ctatcctaaa cgatccgcag 660

gcgcttggaa acatggacat tctcggagcc catttgtacg gaacccaaat cagccagctt 720

ccgtatcctc ttttcaaaca aaagggaggg ggaaaggagc tttggatgac agaggtctac 780

tacccgaata gcgataacaa ttcagcggac cgctggcctg aagcattagg ggtttcagag 840

catattcacc attcgatggt agaaggggac tttcaggcat atgtttggtg gtacattcgc 900

agatcctacg gccctatgaa agaagacggt ctaatcagca aacgtggtta caacatggcg 960

catttctcca agtttgtacg cccaggatac atcagaattg atgcaacgaa aagtcctgaa 1020

ccgaatgttt tcgtatcagc ctataaagga aacaatcaag tcgtcattgt cgcgattaac 1080

aaaaacaata caggagtcaa tcagcacttt gtgatgcaaa acggaactgc ttcacaagcg 1140

tcaagatgga tcacaagtag caacagcaac cttcagcctg gtacagactt aaatatatca 1200

ggtaatcaat tttgggctca tctcccggct caaagtgtga caacatttgt ggtcaaacgc 1260

tag 1263

<210> 246 
<211> 401 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
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<221> SIGNAL 
<222> (1)...(32) 

<400> 246 

Met Ser Met Ile Lys Lys Pro Ile Cys Thr Leu Leu Ile Cys Phe Thr

1 5 10 15

Met Leu Ser Val Met Phe Ile Gly Pro Gly Val Thr Glu Val Ser Ala

20 25 30

Ala Asp Ala Asn Ile Asn Ile Asn Ala Glu Arg Gln Val Ile Arg Gly

35 40 45

Phe Gly Gly Met Asn His Pro Ala Trp Ile Gly Asp Leu Thr Ala Pro

50 55 60

Gln Arg Glu Thr Ala Phe Gly Asn Gly Gln Asn Gln Leu Gly Phe Ser
65 70 75 80

Ile Leu Arg Ile Phe Val Asp Glu Asn Arg Asn Asn Trp His Arg Glu

85 90 95

Val Ala Thr Ala Lys Arg Ala Ile Glu His Gly Ala Leu Val Ile Ala

100 105 110

Ser Pro Trp Asn Pro Pro Ser Asn Met Val Glu Thr Phe Asn Arg Asn

115 120 125

Gly Thr Ser Ala Lys Arg Leu Arg Tyr Asn Gln Tyr Ala Ala Tyr Ala

130 135 140

Gln His Leu Asn Asp Phe Val Thr Tyr Met Lys Asn Asn Gly Val Asn
145 150 155 160

Leu Tyr Ala Ile Ser Val Gln Asn Glu Pro Asp Tyr Ala His Glu Trp

165 170 175

Thr Trp Trp Thr Pro Gln Glu Ile Leu Arg Phe Met Arg Glu Asn Ala

180 185 190

Gly Ser Ile Asn Ala Arg Val Ile Ala Pro Glu Ser Phe Gln Tyr Leu

195 200 205

Lys Asn Ile Ser Asp Pro Ile Leu Asn Asp Pro Gln Ala Leu Gly Asn

210 215 220

Met Asp Ile Leu Gly Ala His Leu Tyr Gly Thr Gln Ile Ser Gln Leu
225 230 235 240

Pro Tyr Pro Leu Phe Lys Gln Lys Gly Gly Gly Lys Glu Leu Trp Met

245 250 255

Thr Glu Val Tyr Tyr Pro Asn Ser Asp Asn Asn Ser Ala Asp Arg Trp

260 265 270

Pro Glu Ala Leu Gly Val Ser Glu His Ile His His Ser Met Val Glu

275 280 285

Gly Asp Phe Gln Ala Tyr Val Trp Trp Tyr Ile Arg Arg Ser Tyr Gly

290 295 300

Pro Met Lys Glu Asp Gly Leu Ile Ser Lys Arg Gly Tyr Asn Met Ala
305 310 315 320

His Phe Ser Lys Phe Val Arg Pro Gly Tyr Ile Arg Ile Asp Ala Thr

325 330 335

Lys Ser Pro Glu Pro Asn Val Phe Val Ser Ala Tyr Lys Gly Asn Asn

340 345 350

Gln Val Val Ile Val Ala Ile Asn Lys Asn Asn Thr Gly Val Asn Gln

355 360 365

His Phe Val Met Gln Asn Gly Thr Ala Ser Gln Ala Ser Arg Trp Ile

370 375 380

Thr Ser Ser Asn Ser Asn Leu Gln Pro Gly Thr Asp Leu Asn Ile Ser
385 390 395 400

Gly 

<210> 247 
<211> 1044 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 247 

gtgtttgcca acgatttcct gataggcgtg gcgctcaact cacggcaggt cgccggggaa 60 
tccgaggccg gaaaactagc tggcgcgcaa ttttcgtcgg tgacggcgga gaatgagatg 120 
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aagtggcagt cgctccatcc ccagcccgac cgctatcagt tcggcgcggc ggactcctac 180 

atcgattttg ccaaaaaaca caagatggcg gtgatcggcc acacgctcgt gtggcacagc 240 

cagacacccg gctgggtgtt cgagggcaag gacggcaagc cggcgacccg cgaggatctg 300 

ctcaagcgca tgcgcgatca catccacacc gtggccggac gctacaaggg caaggtgcgc 360 

ggctgggacg tggtcaacga ggccttgtcc gacggcggtc ccgaaatcct gcgggattct 420 

ccgtggcggc gcatcatcgg cgatgacttc atcgaccacg cgttccgttt cgcccgtgag 480 

gccgatccga aagccgaact ctactacaac gactacggtc tcgagaacga aaggaagcgg 540 

agcaactgca tcaagctcgt caagggcatg aaacaacgcg gcgtgccgat cgacggggtg 600 

ggcacccagt cgcatttcca cttgaaacat ccctcgctcc aggaaatcga aaagaccatc 660 

aaggactttt ccgaactcgg actcaaggtg atgatcaccg agctggatgt cgatgtgctg 720 

ccgtcgcgtg gcaatttcgg caacgccgac atcagccgcc gcgagcaggg cggtgacgca 780 

ctcaatcctt acaccggcgg cttgcccgat gaggtccaac aggaacttgc gaaacgctat 840 

gcggacattt ttgatatcta tctgcgccac cggaaggcgg tcacccgcgt aaccttctgg 900 

ggactcgatg acgggcatac ctggttgaac ggtttcccga tccgcggacg caccaactat 960 

ccgctgttgt tcgaccgcgc cctcaagccg aagcccgcgt tcgaggcggt catcaaaaaa 1020 

gggcttgaac ccaggaaacg ttga 1044

<210> 248 
<211> 347 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 248 

Val Phe Ala Asn Asp Phe Leu Ile Gly Val Ala Leu Asn Ser Arg Gln

1 5 10 15

Val Ala Gly Glu Ser Glu Ala Gly Lys Leu Ala Gly Ala Gln Phe Ser

20 25 30

Ser Val Thr Ala Glu Asn Glu Met Lys Trp Gln Ser Leu His Pro Gln

35 40 45

Pro Asp Arg Tyr Gln Phe Gly Ala Ala Asp Ser Tyr Ile Asp Phe Ala

50 55 60

Lys Lys His Lys Met Ala Val Ile Gly His Thr Leu Val Trp His Ser
65 70 75 80

Gln Thr Pro Gly Trp Val Phe Glu Gly Lys Asp Gly Lys Pro Ala Thr

85 90 95

Arg Glu Asp Leu Leu Lys Arg Met Arg Asp His Ile His Thr Val Ala

100 105 110

Gly Arg Tyr Lys Gly Lys Val Arg Gly Trp Asp Val Val Asn Glu Ala

115 120 125

Leu Ser Asp Gly Gly Pro Glu Ile Leu Arg Asp Ser Pro Trp Arg Arg

130 135 140

Ile Ile Gly Asp Asp Phe Ile Asp His Ala Phe Arg Phe Ala Arg Glu
145 150 155 160

Ala Asp Pro Lys Ala Glu Leu Tyr Tyr Asn Asp Tyr Gly Leu Glu Asn

165 170 175

Glu Arg Lys Arg Ser Asn Cys Ile Lys Leu Val Lys Gly Met Lys Gln

180 185 190

Arg Gly Val Pro Ile Asp Gly Val Gly Thr Gln Ser His Phe His Leu

195 200 205

Lys His Pro Ser Leu Gln Glu Ile Glu Lys Thr Ile Lys Asp Phe Ser
210 215 220

Glu Leu Gly Leu Lys Val Met Ile Thr Glu Leu Asp Val Asp Val Leu
225 230 235 240

Pro Ser Arg Gly Asn Phe Gly Asn Ala Asp Ile Ser Arg Arg Glu Gln

245 250 255

Gly Gly Asp Ala Leu Asn Pro Tyr Thr Gly Gly Leu Pro Asp Glu Val

260 265 270

Gln Gln Glu Leu Ala Lys Arg Tyr Ala Asp Ile Phe Asp Ile Tyr Leu

275 280 285

Arg His Arg Lys Ala Val Thr Arg Val Thr Phe Trp Gly Leu Asp Asp

290 295 300

Gly His Thr Trp Leu Asn Gly Phe Pro Ile Arg Gly Arg Thr Asn Tyr
305 310 315 320

Pro Leu Leu Phe Asp Arg Ala Leu Lys Pro Lys Pro Ala Phe Glu Ala

325 330 335

Val Ile Lys Lys Gly Leu Glu Pro Arg Lys Arg

340 345

<210> 249 
<211> 1439 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample
<400> 249

tgatcaatcc agtgaaggat cttcgtgaag atttcatctt tggaatggac gtttcaatgc 60

tctacgagat agagcggctc ggtggtaagt tcttcgatgg tggtgtggag aaagatcttt 120

tccagatact gaaggatcat gagataaact ggatcagatt gagagtgtgg aacgatccaa 180

gggatgaaaa cggaaatccg ctcggtgggg gaaactgtga ttatctgaaa atgacagaga 240

tcgcaaaaag ggccaaaaag tacggaatga aggttcttct tgactttcac tacagcgact 300

ggtgggcaga tcccggcaag cagtacaaac caaaagagtg ggatcacctt catggagaac 360

ttctggaaag ggcggtgtat tcctacacga aactcgtgct gaatcatatg agaagaaacg 420

gtgcactgcc ggacatggtc caggtgggaa acgaggtgaa caacggcttt ctctggccgg 480

atggaatgat tgccggaaag gatgcaggag gattcgacgg attcacaaaa cttttgaagg 540

cggccatcaa agccgtcagg gaagttgatc ccgatatcaa gatagtcatt catttggcag 600

aaggtggaaa caactcactt ttcaggtggt tcttcgacga gatcacaaga agagacgtgg 660

attttgatgt gatcggtgta tcgtactatc cgtactggca tggtaccctg gatgacctga 720

agaacaacct gtacgacata gcgaaaagat acaacaaaga cgtgctcatc gttgaaacgg 780

cgtatgcctg gacactcgag gacggggacg gttaccccaa catcttcagt ggtgaagaga 840

tggagctcac gggtggttac aaagcaacgg ttcagggaca ggcaacgttc ttgagggatc 900

tcatagaagt ggtgaacagt gttcctgacg gtcacggtct tgggatcttc tattgggaag 960

gagactggat tcctgtgaaa ggagccggct ggaaaaccgg cgaaggaaat ccatgggaga 1020

atcaggccat gtttgatttc aatggaaatg ctctcccatc cctggatgtt ttcaagctcg 1080

tgaggacagt cactcctatg gaaataaaaa tcgaagagat tctgcctgtg gagatctcga 1140

cgaatttggg agagattccg aagtttccgg atgctgtgaa agtgctgttc agcgatgatt 1200

ccatcagatc cctgaaagtt acatggaatt ttgatccttc tcttgttgaa acacccggtg 1260

tctacagagt ggaaggatac gtggaaagta tagaccagaa gatcttcgca accttgactg 1320

tgaagggaag tagaaactac ctgaagaacc ctgggtttga aacgggtgag ttttctccct 1380

ggaaggtgtt cggtaacgga aaacgcagtg aaggtggtaa aggccgatcc tccgagtaa 1439

<210> 250 
<211> 479 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(33)

<400> 250

Met Ile Asn Pro Val Lys Asp Leu Arg Glu Asp Phe Ile Phe Gly Met
1 5 10 15

Asp Val Ser Met Leu Tyr Glu Ile Glu Arg Leu Gly Gly Lys Phe Phe

20 25 30

Asp Gly Gly Val Glu Lys Asp Leu Phe Gln Ile Leu Lys Asp His Glu

35 40 45

Ile Asn Trp Ile Arg Leu Arg Val Trp Asn Asp Pro Arg Asp Glu Asn

50 55 60

Gly Asn Pro Leu Gly Gly Gly Asn Cys Asp Tyr Leu Lys Met Thr Glu
65 70 75 80

Ile Ala Lys Arg Ala Lys Lys Tyr Gly Met Lys Val Leu Leu Asp Phe

85 90 95

His Tyr Ser Asp Trp Trp Ala Asp Pro Gly Lys Gln Tyr Lys Pro Lys

100 105 110

Glu Trp Asp His Leu His Gly Glu Leu Leu Glu Arg Ala Val Tyr Ser

115 120 125

Tyr Thr Lys Leu Val Leu Asn His Met Arg Arg Asn Gly Ala Leu Pro

130 135 140

Asp Met Val Gln Val Gly Asn Glu Val Asn Asn Gly Phe Leu Trp Pro
145 150 155 160

Asp Gly Met Ile Ala Gly Lys Asp Ala Gly Gly Phe Asp Gly Phe Thr

165 170 175

Lys Leu Leu Lys Ala Ala Ile Lys Ala Val Arg Glu Val Asp Pro Asp

180 185 190

Ile Lys Ile Val Ile His Leu Ala Glu Gly Gly Asn Asn Ser Leu Phe

195 200 205

Arg Trp Phe Phe Asp Glu Ile Thr Arg Arg Asp Val Asp Phe Asp Val

210 215 220

Ile Gly Val Ser Tyr Tyr Pro Tyr Trp His Gly Thr Leu Asp Asp Leu
225 230 235 240

Lys Asn Asn Leu Tyr Asp Ile Ala Lys Arg Tyr Asn Lys Asp Val Leu

245 250 255

Ile Val Glu Thr Ala Tyr Ala Trp Thr Leu Glu Asp Gly Asp Gly Tyr

260 265 270

Pro Asn Ile Phe Ser Gly Glu Glu Met Glu Leu Thr Gly Gly Tyr Lys

275 280 285

Ala Thr Val Gln Gly Gln Ala Thr Phe Leu Arg Asp Leu Ile Glu Val

290 295 300

Val Asn Ser Val Pro Asp Gly His Gly Leu Gly Ile Phe Tyr Trp Glu
305 310 315 320

Gly Asp Trp Ile Pro Val Lys Gly Ala Gly Trp Lys Thr Gly Glu Gly

325 330 335

Asn Pro Trp Glu Asn Gln Ala Met Phe Asp Phe Asn Gly Asn Ala Leu

340 345 350

Pro Ser Leu Asp Val Phe Lys Leu Val Arg Thr Val Thr Pro Met Glu

355 360 365

Ile Lys Ile Glu Glu Ile Leu Pro Val Glu Ile Ser Thr Asn Leu Gly

370 375 380

Glu Ile Pro Lys Phe Pro Asp Ala Val Lys Val Leu Phe Ser Asp Asp
385 390 395 400

Ser Ile Arg Ser Leu Lys Val Thr Trp Asn Phe Asp Pro Ser Leu Val

405 410 415

Glu Thr Pro Gly Val Tyr Arg Val Glu Gly Tyr Val Glu Ser Ile Asp

420 425 430

Gln Lys Ile Phe Ala Thr Leu Thr Val Lys Gly Ser Arg Asn Tyr Leu

435 440 445

Lys Asn Pro Gly Phe Glu Thr Gly Glu Phe Ser Pro Trp Lys Val Phe

450 455 460

Gly Asn Gly Lys Arg Ser Glu Gly Gly Lys Gly Arg Ser Ser Glu
465 470 475 

<210> 251 
<211> 555 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 251 

atggctacgg attattggca atattggacg gatggcggcg gaacggtgaa tgcggttaac 60 

gggtccgggg gcaattacag cgtaacttgg caaaatagcg gggacttcgt ggtcggcaaa 120 

ggctggagcg tagggtcgcc aaatcggacg atcaattaca atgccggcat ctgggaacct 180 

tcggggaacg ggtacttgac cctttacgga tggactagaa actcgctgat cgagtattac 240 

gttgtcgaca gttgggggac gtaccggcca acaggtactc acaaaggaac ggtgaacagc 300 

gacggaggca cctacgatat ttatacgacc atgcgctata atgcgccttc cattgatggc 360 

acgcagacgt tccaacagtt ctggagcgtg cggcaatcga aacgaccaac cggcagcaac 420 

gtctccatca ccttcagcaa tcacgtgaat gcctggagaa gcaagggcat gaacctgggc 480 

agcagctggt cgtaccaggt cttggcgacg gaaggctatc agagcagcgg aagatccaac 540 

gtcacggtgt ggtaa 555 

<210> 252 
<211> 184 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 252

Met Ala Thr Asp Tyr Trp Gln Tyr Trp Thr Asp Gly Gly Gly Thr Val

1 5 10 15

Asn Ala Val Asn Gly Ser Gly Gly Asn Tyr Ser Val Thr Trp Gln Asn

20 25 30

Ser Gly Asp Phe Val Val Gly Lys Gly Trp Ser Val Gly Ser Pro Asn

35 40 45

Arg Thr Ile Asn Tyr Asn Ala Gly Ile Trp Glu Pro Ser Gly Asn Gly

50 55 60

Tyr Leu Thr Leu Tyr Gly Trp Thr Arg Asn Ser Leu Ile Glu Tyr Tyr
65 70 75 80

Val Val Asp Ser Trp Gly Thr Tyr Arg Pro Thr Gly Thr His Lys Gly

85 90 95

Thr Val Asn Ser Asp Gly Gly Thr Tyr Asp Ile Tyr Thr Thr Met Arg

100 105 110

Tyr Asn Ala Pro Ser Ile Asp Gly Thr Gln Thr Phe Gln Gln Phe Trp

115 120 125

Ser Val Arg Gln Ser Lys Arg Pro Thr Gly Ser Asn Val Ser Ile Thr

130 135 140

Phe Ser Asn His Val Asn Ala Trp Arg Ser Lys Gly Met Asn Leu Gly
145 150 155 160

Ser Ser Trp Ser Tyr Gln Val Leu Ala Thr Glu Gly Tyr Gln Ser Ser

165 170 175

Gly Arg Ser Asn Val Thr Val Trp
180 

<210> 253 
<211> 1047 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 253 

atgattgtta gcttcaagag cctgaaggca ctcgcgtgcc tcggcgtgct cggcatcacc 60 

gccgcgcacg cgcaaacctg catcacgtcg agccagacgg gcaccaacaa cggcaattac 120 

ttttcgttct ggaaagacag tccgggcacg gtgaacttct gcatgtatgc gaatggccgc 180 

tatacctcca actggagcgg catcaacaac tgggtgggcg gcaagggctg ggctaccggc 240 

tccagccaca cgatcagcta ctccggcacg ttcaattcgc cgggcaacgg ttacctggcc 300 

ctgtatggct ggaccaccaa tccattggtc gagtactaca tcgtcgacag ctggggtacc 360 

taccgtccgc cgggcggcca gggtttcatg ggcacggtag ttagcgacgg gggcacgtac 420 

gacgtgtacc ggacgcaacg cgtgaaccag ccatccatca tcggcaacgc cacgttctac 480 

cagtactgga gcgtgcggca gtcgaagcgc gtgggcggca ccatcaccat cgccaaccat 540 

ttcaacgcct gggccacgct gggcatgaac ctgggccagc acaactacca ggtcatggcc 600 

accgagggtt accagagcag cggcagctcc gacatcaccg tgaccgaagg tggcggcagy 660 

tcctcgtcgt cctcgggcgg cggcagcacc agcagcagtg gtggcggcgg caacaagagc 720 

ttcacggtgc gtgcgcgcgg cacggccgga ggcgagaaca tccagctgca ggtgaacaac 780 

cagacggtcg cgagctggaa cctcaccacc agcatgcaga actacaccgc ctcgaccagc 840 

ctgagcggcg gcatcaccgt gctctacacc aacgacggcg gcagccgcga cgtgcaggtg 900 

gactacatca tcgtgaacgg ccagacccgc cagtccgaag cgcagagcta caacaccggg 960 

ttgtatgcga atggacgctg cggcggtggc tcgaacagcg agtggatgca ttgcaacggc 1020 

gcgatcggct acggcaatac gccctga 1047

<210> 254 
<211> 347 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 

<221> SIGNAL 
<222> (1)...(24) 

<400> 254 

Met Ile Val Ser Phe Lys Ser Leu Lys Ala Leu Ala Cys Leu Gly Val
1 5 10 15

Leu Gly Ile Thr Ala Ala His Ala Gln Thr Cys Ile Thr Ser Ser Gln
20 25 30

Thr Gly Thr Asn Asn Gly Asn Tyr Phe Ser Phe Trp Lys Asp Ser Pro

35 40 45

Gly Thr Val Asn Phe Cys Met Tyr Ala Asn Gly Arg Tyr Thr Ser Asn

50 55 60

Trp Ser Gly Ile Asn Asn Trp Val Gly Gly Lys Gly Trp Ala Thr Gly
65 70 75 80

Ser Ser His Thr Ile Ser Tyr Ser Gly Thr Phe Asn Ser Pro Gly Asn

85 90 95

Gly Tyr Leu Ala Leu Tyr Gly Trp Thr Thr Asn Pro Leu Val Glu Tyr

100 105 110

Tyr Ile Val Asp Ser Trp Gly Thr Tyr Arg Pro Pro Gly Gly Gln Gly

115 120 125

Phe Met Gly Thr Val Val Ser Asp Gly Gly Thr Tyr Asp Val Tyr Arg

130 135 140

Thr Gln Arg Val Asn Gln Pro Ser Ile Ile Gly Asn Ala Thr Phe Tyr
145 150 155 160

Gln Tyr Trp Ser Val Arg Gln Ser Lys Arg Val Gly Gly Thr Ile Thr

165 170 175

Ile Ala Asn His Phe Asn Ala Trp Ala Thr Leu Gly Met Asn Leu Gly

180 185 190

Gln His Asn Tyr Gln Val Met Ala Thr Glu Gly Tyr Gln Ser Ser Gly

195 200 205

Ser Ser Asp Ile Thr Val Thr Glu Gly Gly Gly Ser Ser Ser Ser Ser

210 215 220

Gly Gly Gly Ser Thr Ser Ser Ser Gly Gly Gly Gly Asn Lys Ser Phe
225 230 235 240

Thr Val Arg Ala Arg Gly Thr Ala Gly Gly Glu Asn Ile Gln Leu Gln

245 250 255

Val Asn Asn Gln Thr Val Ala Ser Trp Asn Leu Thr Thr Ser Met Gln

260 265 270

Asn Tyr Thr Ala Ser Thr Ser Leu Ser Gly Gly Ile Thr Val Leu Tyr

275 280 285

Thr Asn Asp Gly Gly Ser Arg Asp Val Gln Val Asp Tyr Ile Ile Val

290 295 300

Asn Gly Gln Thr Arg Gln Ser Glu Ala Gln Ser Tyr Asn Thr Gly Leu
305 310 315 320

Tyr Ala Asn Gly Arg Cys Gly Gly Gly Ser Asn Ser Glu Trp Met His

325 330 335

Cys Asn Gly Ala Ile Gly Tyr Gly Asn Thr Pro
340 345 

<210> 255 
<211> 1137 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 255 

ttgatctttt ccgtcagtgg ttccgcgtct cggcggcgcc ctggcatcca caagggggat 60 

tccatgattt tcggtctaaa gtcgatcacg ggcaggcgcg ccgtcgcggc gctggcctgc 120 

cttgccggcc tctacatggc gccggcgaat gcgcaaacct gcatcacgtc gagccagacg 180 

ggcaccaaca acggcaacta cttttcgttc tggaaagaca gcccgggcac ggtgaacttc 240 

tgcatgtact ccggcggccg ctacacgtcc aactggagcg gcatcaacaa ctgggtgggc 300 

ggcaagggct ggcagacggg ctcgtcccgc accgtctcct actccggcag cttcaattcg 360 

ccgggtaacg gctacctgac gctctacggc tggaccacca atccgctcat cgagtactac 420 

atcgtcgaca actggggcag ctatcgtccg ccgggtggcc agggcttcat gggcacggtg 480 

aacaccgacg gcggcacgta cgacatctat cgcacgcaac gggtcaacca gccgtcgatc 540 

atcggcaccg cgacgttcta ccagtactgg agcgtgcggc agtcgaagcg caccggcggc 600 

accatcacca cggccaacca cttcaatgcc tgggccagcc tcggcatgaa cctgggacag 660 

cacaactacc aggtgatggc caccgagggc taccagagca gcggcagctc cgacatcacg 720 

gtgtgggaag gcacgagcag cggcggaagc agcaatggcg gcagcagcaa cggcggcagc 780 

agcaatggtg gcagcggcgg cacgaagagc ttcacggtgc gcgcgcgcgg cactgcgggc 840 

ggcgagtcca tcacgctgcg ggtcaacaac cagaacgtgc agacctggac gctgggtacc 900 

agcatgcaga actacacggc ctcgacctcg ctgagcggcg gcatcacggt ggcgttcacc 960 

aacgacggcg gcagccgcga cgtgcaggtg gactacatca tcgtgaatgg ccagacccgc 1020 

cagtccgaac agcagagcta caacactggc ctctacgcca atggaagctg tggtggcggt 1080 

tcgaacagcg agtggatgca ttgcaacggc gccatcggct acggcaatac gccctga 1137

<210> 256
<211> 378
<212> PRT
<213> Unknown

<220>

<223> Obtained from an environmental sample

<221> SIGNAL
<222> (1)...(51)

<400> 256

Leu Ile Phe Ser Val Ser Gly Ser Ala Ser Arg Arg Arg Pro Gly Ile

1 5 10 15

His Lys Gly Asp Ser Met Ile Phe Gly Leu Lys Ser Ile Thr Gly Arg

20 25 30

Arg Ala Val Ala Ala Leu Ala Cys Leu Ala Gly Leu Tyr Met Ala Pro

35 40 45

Ala Asn Ala Gln Thr Cys Ile Thr Ser Ser Gln Thr Gly Thr Asn Asn

50 55 60

Gly Asn Tyr Phe Ser Phe Trp Lys Asp Ser Pro Gly Thr Val Asn Phe
65 70 75 80

Cys Met Tyr Ser Gly Gly Arg Tyr Thr Ser Asn Trp Ser Gly Ile Asn

85 90 95

Asn Trp Val Gly Gly Lys Gly Trp Gln Thr Gly Ser Ser Arg Thr Val

100 105 110

Ser Tyr Ser Gly Ser Phe Asn Ser Pro Gly Asn Gly Tyr Leu Thr Leu

115 120 125

Tyr Gly Trp Thr Thr Asn Pro Leu Ile Glu Tyr Tyr Ile Val Asp Asn

130 135 140

Trp Gly Ser Tyr Arg Pro Pro Gly Gly Gln Gly Phe Met Gly Thr Val
145 150 155 160

Asn Thr Asp Gly Gly Thr Tyr Asp Ile Tyr Arg Thr Gln Arg Val Asn

165 170 175

Gln Pro Ser Ile Ile Gly Thr Ala Thr Phe Tyr Gln Tyr Trp Ser Val

180 185 190

Arg Gln Ser Lys Arg Thr Gly Gly Thr Ile Thr Thr Ala Asn His Phe

195 200 205

Asn Ala Trp Ala Ser Leu Gly Met Asn Leu Gly Gln His Asn Tyr Gln

210 215 220

Val Met Ala Thr Glu Gly Tyr Gln Ser Ser Gly Ser Ser Asp Ile Thr
225 230 235 240

Val Trp Glu Gly Thr Ser Ser Gly Gly Ser Ser Asn Gly Gly Ser Ser

245 250 255

Asn Gly Gly Ser Ser Asn Gly Gly Ser Gly Gly Thr Lys Ser Phe Thr

260 265 270

Val Arg Ala Arg Gly Thr Ala Gly Gly Glu Ser Ile Thr Leu Arg Val

275 280 285

Asn Asn Gln Asn Val Gln Thr Trp Thr Leu Gly Thr Ser Met Gln Asn

290 295 300

Tyr Thr Ala Ser Thr Ser Leu Ser Gly Gly Ile Thr Val Ala Phe Thr
305 310 315 320

Asn Asp Gly Gly Ser Arg Asp Val Gln Val Asp Tyr Ile Ile Val Asn

325 330 335

Gly Gln Thr Arg Gln Ser Glu Gln Gln Ser Tyr Asn Thr Gly Leu Tyr

340 345 350

Ala Asn Gly Ser Cys Gly Gly Gly Ser Asn Ser Glu Trp Met His Cys

355 360 365

Asn Gly Ala Ile Gly Tyr Gly Asn Thr Pro
370 375 

<210> 257 

<211> 2694 

<212> DNA 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample 
<400> 257

atggctgata tatctaccac accagtcaca gcctcgacag atgctgccaa gaacctgtat 60

gcctatttcc tggaccagta tggcaagaag acgatttcca gcgtcatggc caatgtcaac 120

tggaacaaca cttgtgccga gaaagtctat aaactcacgg gcaagtatcc tgccatgaac 180

tgctacgact tcatccacat ctgtttctcg ccagccaact ggattgacta caccgacatc 240

actcctgcca aggaatggca cgatgcgggc ggtatcgtac agttgatgtg gcatttcaat 300

gtgcctaaga gccagggggc aacagatgtt acctgcacgc ccagcgagac cacctttaag 360

gcttccaatg ctctggttag cggcacgtgg gagaacaaat ggttctacga gcagatggac 420

aaggtcattg ccaccatcct caagttacag gacgctggca ttgccgctac ctggcgacct 480

ttccatgagg cagcaggcaa tgcttgcgcc aagcagcagg ccgactggac caaagcatgg 540

ttctggtggg gctacgacgg tgccgacacc tacaagaaac tgtggattgc catgtacgac 600

tatttcaagc tgaaaggcgt gaacaacctc atctggatgt ggaccaccca gaattataat 660

ggtgacagca gcaaatacaa ccaggacacc gactggtacc ctggcgacga gtatgttgac 720

atcgtggccc gcgacctcta tggctacaat gccgaccaga acctgcagga gttcagcgag 780

attcaggctg cctatcccaa caagatggtg gttctgggtg aatgcggaaa aggtgatagc 840

ggcgaccccg gcaagatgtc cgatgtatgg gcgaaaggtg ccaagtgggg ccacttcatg 900

gtatggtatc aaggcgaaca aggctctacc gacacgatgt gcagcgacga ctggtggaag 960

gatgccatga gcagcgccaa cgtcatcacc cgcgacaagg tggttatccc cgatgtcact 1020

tcaaccatcg agaatgccac ggatgccgtg aagaacatgg gactggggtg gaacctgggg 1080

aacgccctcg acgccaatgc ccagcaatac catgatgcca cccaggacaa ctactgggga 1140

cagcaggaca ttacctctga gagctgctgg ggtcagctac ccaccaaggc agagctgatg 1200

gccatgatga aagaagccgg tttcggagcc atccgcgttc ccgtgacatg gtataaccac 1260

atggacaagg acggcaatgt ggatgcagca tggatgaatc gtgtgcatga ggtggttgac 1320

tatgtcatca gccagggaat gtactgcatc ctcaacgtac accacgacac gggtgccgac 1380

agctacgaca gccagaagaa cctcaccggc taccattgga tcaaggccga cgaaaccaac 1440

tacgccacca acaaggcccg ctatgagaag ctgtggcagc agatagccca ggagttccgc 1500

aactacggcc agctgctgct gttcgagggc tataacgaga tgctcgatgc caacaactcc 1560

tggaattttg cacagagcag ttcagcctac gatgccatca acaaatacgc ccagagcttt 1620

gtcgatgtcg tacgcgccac cggtggcaac aatgcccagc gcaacctcat tgtcagcaca 1680

tacggcgcct gctcaggcaa cggcacgtgg gatgcaagag tgcaagaccc cttgaagaaa 1740

ctgcagattc ccacgggtga aagcaaccat atcatcttcg aggttcacaa ctatccctcc 1800

atcgtcaaca aggacaacgc gggcaactac gtcagcgatc gcaccatcag cgaaatcaag 1860

gcagagattg atgcatggct taagaactta aagacccacc tcgtcagcaa gggcgctccc 1920

gtcatcatcg gcgaatgggg caccaacaac gtcgatgccg gcggtggcaa gacagactac 1980

gacctccata aggacctgat gttcgaattt gtcagctaca tgataaagac catgaagcag 2040

aacgacattg ccaccttcta ctggatggga cttaccgacg gcgctccacg cacctacccc 2100

gccttcacac agcccgacct ggcgctgaag atgctgcagg cctatcacgg cgactcttgg 2160

aatccctacc tgcctgacgc caaggacttt cccgaaggca aaatcacctc ggccacggtg 2220

aatttcaaca gccaatgggg cgaactgacc atccacgatg gagctattga caagaccgtc 2280

tatagaggta tcaaggtgga gctggaagaa aagcctgcca ctggagccct gtctttcaag 2340

gtatatgcca acagtgagaa ggcaacagcc atcaattcca aaaccccaca gttggctttc 2400

ttcagttaca caggcatcca gaaaatcaac ctacagtgga acatagccac caaggggagt 2460

atcaaaatca agagcgtcaa ccttatcaag cacgacgact ccacagaacc ctgtagtctg 2520

aaagtggctt ggggttgtac tctcagcgac cagaactacg ccacgggcat cgaagacatt 2580

actatcactc ctgttcgtca tgacgatgga atcatctaca atctgagcgg acagcctgta 2640

acctctcctc agcgcggcat ctacatcctc aacggaaaga aaatcatcaa atag 2694

<210> 258
<211> 897
<212> PRT
<213> Unknown

<220>

<223> Obtained from an environmental sample
<400> 258

Met Ala Asp Ile Ser Thr Thr Pro Val Thr Ala Ser Thr Asp Ala Ala
1 5 10 15

Lys Asn Leu Tyr Ala Tyr Phe Leu Asp Gln Tyr Gly Lys Lys Thr Ile

20 25 30

Ser Ser Val Met Ala Asn Val Asn Trp Asn Asn Thr Cys Ala Glu Lys

35 40 45

Val Tyr Lys Leu Thr Gly Lys Tyr Pro Ala Met Asn Cys Tyr Asp Phe

50 55 60

Ile His Ile Cys Phe Ser Pro Ala Asn Trp Ile Asp Tyr Thr Asp Ile
65 70 75 80

Thr Pro Ala Lys Glu Trp His Asp Ala Gly Gly Ile Val Gln Leu Met

85 90 95

Trp His Phe Asn Val Pro Lys Ser Gln Gly Ala Thr Asp Val Thr Cys

100 105 110

Thr Pro Ser Glu Thr Thr Phe Lys Ala Ser Asn Ala Leu Val Ser Gly

115 120 125

Thr Trp Glu Asn Lys Trp Phe Tyr Glu Gln Met Asp Lys Val Ile Ala

130 135 140

Thr Ile Leu Lys Leu Gln Asp Ala Gly Ile Ala Ala Thr Trp Arg Pro
145 150 155 160

Phe His Glu Ala Ala Gly Asn Ala Cys Ala Lys Gln Gln Ala Asp Trp

165 170 175

Thr Lys Ala Trp Phe Trp Trp Gly Tyr Asp Gly Ala Asp Thr Tyr Lys

180 185 190

Lys Leu Trp Ile Ala Met Tyr Asp Tyr Phe Lys Leu Lys Gly Val Asn

195 200 205

Asn Leu Ile Trp Met Trp Thr Thr Gln Asn Tyr Asn Gly Asp Ser Ser

210 215 220

Lys Tyr Asn Gln Asp Thr Asp Trp Tyr Pro Gly Asp Glu Tyr Val Asp
225 230 235 240

Ile Val Ala Arg Asp Leu Tyr Gly Tyr Asn Ala Asp Gln Asn Leu Gln

245 250 255

Glu Phe Ser Glu Ile Gln Ala Ala Tyr Pro Asn Lys Met Val Val Leu

260 265 270

Gly Glu Cys Gly Lys Gly Asp Ser Gly Asp Pro Gly Lys Met Ser Asp

275 280 285

Val Trp Ala Lys Gly Ala Lys Trp Gly His Phe Met Val Trp Tyr Gln

290 295 300

Gly Glu Gln Gly Ser Thr Asp Thr Met Cys Ser Asp Asp Trp Trp Lys
305 310 315 320

Asp Ala Met Ser Ser Ala Asn Val Ile Thr Arg Asp Lys Val Val Ile

325 330 335

Pro Asp Val Thr Ser Thr Ile Glu Asn Ala Thr Asp Ala Val Lys Asn

340 345 350

Met Gly Leu Gly Trp Asn Leu Gly Asn Ala Leu Asp Ala Asn Ala Gln

355 360 365

Gln Tyr His Asp Ala Thr Gln Asp Asn Tyr Trp Gly Gln Gln Asp Ile

370 375 380

Thr Ser Glu Ser Cys Trp Gly Gln Leu Pro Thr Lys Ala Glu Leu Met
385 390 395 400

Ala Met Met Lys Glu Ala Gly Phe Gly Ala Ile Arg Val Pro Val Thr

405 410 415

Trp Tyr Asn His Met Asp Lys Asp Gly Asn Val Asp Ala Ala Trp Met

420 425 430

Asn Arg Val His Glu Val Val Asp Tyr Val Ile Ser Gln Gly Met Tyr

435 440 445

Cys Ile Leu Asn Val His His Asp Thr Gly Ala Asp Ser Tyr Asp Ser

450 455 460

Gln Lys Asn Leu Thr Gly Tyr His Trp Ile Lys Ala Asp Glu Thr Asn
465 470 475 480

Tyr Ala Thr Asn Lys Ala Arg Tyr Glu Lys Leu Trp Gln Gln Ile Ala

485 490 495

Gln Glu Phe Arg Asn Tyr Gly Gln Leu Leu Leu Phe Glu Gly Tyr Asn

500 505 510

Glu Met Leu Asp Ala Asn Asn Ser Trp Asn Phe Ala Gln Ser Ser Ser

515 520 525

Ala Tyr Asp Ala Ile Asn Lys Tyr Ala Gln Ser Phe Val Asp Val Val

530 535 540

Arg Ala Thr Gly Gly Asn Asn Ala Gln Arg Asn Leu Ile Val Ser Thr
545 550 555 560

Tyr Gly Ala Cys Ser Gly Asn Gly Thr Trp Asp Ala Arg Val Gln Asp
565 570 575

Pro Leu Lys Lys Leu Gln Ile Pro Thr Gly Glu Ser Asn His Ile Ile

580 585 590

Phe Glu Val His Asn Tyr Pro Ser Ile Val Asn Lys Asp Asn Ala Gly

595 600 605

Asn Tyr Val Ser Asp Arg Thr Ile Ser Glu Ile Lys Ala Glu Ile Asp

610 615 620

Ala Trp Leu Lys Asn Leu Lys Thr His Leu Val Ser Lys Gly Ala Pro
625 630 635 640

Val Ile Ile Gly Glu Trp Gly Thr Asn Asn Val Asp Ala Gly Gly Gly

645 650 655

Lys Thr Asp Tyr Asp Leu His Lys Asp Leu Met Phe Glu Phe Val Ser

660 665 670

Tyr Met Ile Lys Thr Met Lys Gln Asn Asp Ile Ala Thr Phe Tyr Trp

675 680 685

Met Gly Leu Thr Asp Gly Ala Pro Arg Thr Tyr Pro Ala Phe Thr Gln

690 695 700

Pro Asp Leu Ala Leu Lys Met Leu Gln Ala Tyr His Gly Asp Ser Trp
705 710 715 720

Asn Pro Tyr Leu Pro Asp Ala Lys Asp Phe Pro Glu Gly Lys Ile Thr

725 730 735

Ser Ala Thr Val Asn Phe Asn Ser Gln Trp Gly Glu Leu Thr Ile His

740 745 750

Asp Gly Ala Ile Asp Lys Thr Val Tyr Arg Gly Ile Lys Val Glu Leu

755 760 765

Glu Glu Lys Pro Ala Thr Gly Ala Leu Ser Phe Lys Val Tyr Ala Asn

770 775 780

Ser Glu Lys Ala Thr Ala Ile Asn Ser Lys Thr Pro Gln Leu Ala Phe
785 790 795 800

Phe Ser Tyr Thr Gly Ile Gln Lys Ile Asn Leu Gln Trp Asn Ile Ala

805 810 815

Thr Lys Gly Ser Ile Lys Ile Lys Ser Val Asn Leu Ile Lys His Asp

820 825 830

Asp Ser Thr Glu Pro Cys Ser Leu Lys Val Ala Trp Gly Cys Thr Leu

835 840 845

Ser Asp Gln Asn Tyr Ala Thr Gly Ile Glu Asp Ile Thr Ile Thr Pro

850 855 860

Val Arg His Asp Asp Gly Ile Ile Tyr Asn Leu Ser Gly Gln Pro Val
865 870 875 880

Thr Ser Pro Gln Arg Gly Ile Tyr Ile Leu Asn Gly Lys Lys Ile Ile

885 890 895

Lys

<210> 259 
<211> 1143 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 259 

atgaagaaaa ttcgcttact ccagggtgtt tcgttggcca tgtcaataat gtttcttttg 60

tcatgtcagg cacaaaaacc agttgactct cttaaggaag catttgatgg tttgtttctt 120

ataggtactg ccatgaacac ccctcagatc accggccagg atacacaaac acttgagttg 180

ataaaaaaac acatgaactc catagtggcc gaaaatgtaa tgaaaagtga ggtgcttcaa 240

cccagggaag gagagtttga ttttactctt gccgatcagt ttgttcaatt tggtatcgat 300

aacaatatgc atatagttgg ccataccctt atatggcatt cccaggcgcc acgatggttt 360

tttgtggatg agaacggaaa cgatgtgagc cccgaaattc tgaaacaaag aatgaaagac 420

catatttata ccgtagtagg ccgttataaa ggcaaaattc atggatggga tgtggtgaat 480

gagtgtataa atgacgatgg ttcgtggcgc aatagtaagt tttaccaaat tcttggtgaa 540

gattttgtta aatatgcatt ccagtttgca gctgaagccg atcccgatgc agagctttat 600

tacaatgatt attcgatgtt ccttccagga cgtagggaag gcgtaattaa gatggtgaga 660

aatctgcagg aacagggaat taaaattgat ggtattggga tgcagggcca cctgatgatt 720

gattatccac ccctcgaaga ttttgaaacg agtatactgg cttttgccga tctgggggtg 780

aatgtcatga taaccgaact cgatatatcc gttttgccat ttcctacccg caacgtgggc 840

gccgatgttt ctctgaacat tgcatacaat actgaattaa atccctaccc gaatggctta 900

cccgaagatg tagcgcagaa attacataat cggtgggtgg atctttttcg cctgttcatt 960

aaacaccacg ataaaattac ccgtgtaacc acttggggta cagccgatgc catgtcatgg 1020

aagaataact ggcccattcg tggacgtaca gattatccct tacttttcga tcgcgatttt 1080

cagcccaaac cctttgtcgc tgatataatt aaggaggcat tggcagccaa aagaaaatta 1140 

taa 1143 

<210> 260 
<211> 380 
<212> PRT 
<213> Unknown

<220>

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(24) 

<400> 260 

Met Lys Lys Ile Arg Leu Leu Gln Gly Val Ser Leu Ala Met Ser Ile

1 5 10 15

Met Phe Leu Leu Ser Cys Gln Ala Gln Lys Pro Val Asp Ser Leu Lys

20 25 30

Glu Ala Phe Asp Gly Leu Phe Leu Ile Gly Thr Ala Met Asn Thr Pro

35 40 45

Gln Ile Thr Gly Gln Asp Thr Gln Thr Leu Glu Leu Ile Lys Lys His

50 55 60

Met Asn Ser Ile Val Ala Glu Asn Val Met Lys Ser Glu Val Leu Gln
65 70 75 80

Pro Arg Glu Gly Glu Phe Asp Phe Thr Leu Ala Asp Gln Phe Val Gln

85 90 95

Phe Gly Ile Asp Asn Asn Met His Ile Val Gly His Thr Leu Ile Trp

100 105 110

His Ser Gln Ala Pro Arg Trp Phe Phe Val Asp Glu Asn Gly Asn Asp

115 120 125

Val Ser Pro Glu Ile Leu Lys Gln Arg Met Lys Asp His Ile Tyr Thr

130 135 140

Val Val Gly Arg Tyr Lys Gly Lys Ile His Gly Trp Asp Val Val Asn
145 150 155 160

Glu Cys Ile Asn Asp Asp Gly Ser Trp Arg Asn Ser Lys Phe Tyr Gln

165 170 175

Ile Leu Gly Glu Asp Phe Val Lys Tyr Ala Phe Gln Phe Ala Ala Glu

180 185 190

Ala Asp Pro Asp Ala Glu Leu Tyr Tyr Asn Asp Tyr Ser Met Phe Leu

195 200 205

Pro Gly Arg Arg Glu Gly Val Ile Lys Met Val Arg Asn Leu Gln Glu

210 215 220

Gln Gly Ile Lys Ile Asp Gly Ile Gly Met Gln Gly His Leu Met Ile
225 230 235 240

Asp Tyr Pro Pro Leu Glu Asp Phe Glu Thr Ser Ile Leu Ala Phe Ala

245 250 255

Asp Leu Gly Val Asn Val Met Ile Thr Glu Leu Asp Ile Ser Val Leu

260 265 270

Pro Phe Pro Thr Arg Asn Val Gly Ala Asp Val Ser Leu Asn Ile Ala

275 280 285

Tyr Asn Thr Glu Leu Asn Pro Tyr Pro Asn Gly Leu Pro Glu Asp Val

290 295 300

Ala Gln Lys Leu His Asn Arg Trp Val Asp Leu Phe Arg Leu Phe Ile
305 310 315 320

Lys His His Asp Lys Ile Thr Arg Val Thr Thr Trp Gly Thr Ala Asp

325 330 335

Ala Met Ser Trp Lys Asn Asn Trp Pro Ile Arg Gly Arg Thr Asp Tyr

340 345 350

Pro Leu Leu Phe Asp Arg Asp Phe Gln Pro Lys Pro Phe Val Ala Asp

355 360 365

Ile Ile Lys Glu Ala Leu Ala Ala Lys Arg Lys Leu
370 375 380

<210> 261 
<211> 1629 
<212> DNA 
<213> Unknown 

<223> Obtained from an environmental sample. 
<400> 261 

atgataaaca aaattggcaa aggttttttt tctgcgttca tttgtgctgc tgcgttgagt 60 
gtctccacag ttaatgctca gcaaactgtc accaccaaca cgcaaggcac gcacgatggt 120 
tttttctatt cgttttggaa agacagtggt gatgcatcat ttggtttgcg tgagggaggg 180 
cgttacacct cgcaatggaa tacttctacc aataactggg tgggtggaaa agggtggaat 240

cccggtggta gaagggttgt tcactatcaa ggccaatata atgttgataa ttcacaaaac 300

tcttatttgg cattgtatgg ctggacacgc tcaccactga ttgaatatta cgtgattgaa 360 

agttacggct cgtataaccc gtcgaattgc acccaaggtc ggcagaccta tggcaccttt 420 

cagagtgatg gtgcaaccta tgaaattgtt cgctgtcagc gagttcagca gccctctatc 480 

gatggcacac aaactttcta tcaatacttc agtgtgcgtc agccgaagaa aggctttggt 540 

agtatcagtg gtacgatcac tgtgggcaac cattttgatg catgggccgc cgccggtttg 600 

aacctggggg aacatgatta tatggtgatg gctaccgagg gttatcagag caccggtagt 660 

tcggatatta cggtcagtga aattaccggt ggttcaggtg gtggctcttc ctcgggtgct 720 

aataccctgg tgattcgtgc tgtgggcacc tctggtaatg aattgctgcg tgtcaatgtg 780 

ggtggtagcc ctgtgcagac attgagcctt tcgaccagtt ggcaggattt tactgtcaat 840 

acggatgcaa cgggtgacat taacgtagag ttgtttaatg atcagggtca gggttatgag 900 

gcgcgtatcg attatgtgct ggttaatggt gagacccgct acgcggccga tcagagttat 960 

aacaccagtg cctgggacgg cgaatgtggg ggtggctctt ttacccagtg gatgcattgt 1020 

gatggcatga ttggctttgg tgatatgacc ggcggcaatg ccggtggtgg cggttcttcg 1080 

ggtggttctg gcgccaatac tctggtggtg cgtgctgtcg gcacttcagg taacgagcag 1140 

ttgcgcgtga atgtgggcgg caacacgatt caaacactga acctgtcaag cagttggcaa 1200 

gattttactg tcaataccga tgcctcgggc gatattaacg tagagctgtt taatgaccag 1260 

ggtcagggct atgaggcgcg tattgattat gtgctggtta atggcgagac ccgctacgcg 1320 

gctgaccaga gttataacac cagcgcctgg gatggcgaat gcgggggtgg ctcttttacc 1380 

caatggatgc attgtgatgg catgattggt tttggtgata tgtcgggtgg tggttctgct 1440 

gtgggtacaa gcagtagcgg taatgccggc agcaatacca gcagtgcctg ttactgtaat 1500 

tggtatggca gtgtgatggc ttcttgtgaa aatcaggtga acggctgggg ttgggaaaat 1560 

aatcaaagct gtattggtaa taatacctgt aataatcagg gcggtagcgg aggcgtggtg 1620 

tgcaattaa 1629 

<210> 262 
<211> 542 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(26) 

<400> 262 

Met Ile Asn Lys Ile Gly Lys Gly Phe Phe Ser Ala Phe Ile Cys Ala

1 5 10 15

Ala Ala Leu Ser Val Ser Thr Val Asn Ala Gln Gln Thr Val Thr Thr

20 25 30

Asn Thr Gln Gly Thr His Asp Gly Phe Phe Tyr Ser Phe Trp Lys Asp

35 40 45

Ser Gly Asp Ala Ser Phe Gly Leu Arg Glu Gly Gly Arg Tyr Thr Ser

50 55 60

Gln Trp Asn Thr Ser Thr Asn Asn Trp Val Gly Gly Lys Gly Trp Asn
65 70 75 80

Pro Gly Gly Arg Arg Val Val His Tyr Gln Gly Gln Tyr Asn Val Asp

85 90 95

Asn Ser Gln Asn Ser Tyr Leu Ala Leu Tyr Gly Trp Thr Arg Ser Pro

100 105 110

Leu Ile Glu Tyr Tyr Val Ile Glu Ser Tyr Gly Ser Tyr Asn Pro Ser

115 120 125

Asn Cys Thr Gln Gly Arg Gln Thr Tyr Gly Thr Phe Gln Ser Asp Gly

130 135 140

Ala Thr Tyr Glu Ile Val Arg Cys Gln Arg Val Gln Gln Pro Ser Ile
145 150 155 160

Asp Gly Thr Gln Thr Phe Tyr Gln Tyr Phe Ser Val Arg Gln Pro Lys

165 170 175

Lys Gly Phe Gly Ser Ile Ser Gly Thr Ile Thr Val Gly Asn His Phe

180 185 190

Asp Ala Trp Ala Ala Ala Gly Leu Asn Leu Gly Glu His Asp Tyr Met

195 200 205

Val Met Ala Thr Glu Gly Tyr Gln Ser Thr Gly Ser Ser Asp Ile Thr

210 215 220

Val Ser Glu Ile Thr Gly Gly Ser Gly Gly Gly Ser Ser Ser Gly Ala
225 230 235 240

Asn Thr Leu Val Ile Arg Ala Val Gly Thr Ser Gly Asn Glu Leu Leu
245 250 255

Arg Val Asn Val Gly Gly Ser Pro Val Gln Thr Leu Ser Leu Ser Thr

260 265 270

Ser Trp Gln Asp Phe Thr Val Asn Thr Asp Ala Thr Gly Asp Ile Asn

275 280 285

Val Glu Leu Phe Asn Asp Gln Gly Gln Gly Tyr Glu Ala Arg Ile Asp

290 295 300

Tyr Val Leu Val Asn Gly Glu Thr Arg Tyr Ala Ala Asp Gln Ser Tyr
305 310 315 320

Asn Thr Ser Ala Trp Asp Gly Glu Cys Gly Gly Gly Ser Phe Thr Gln

325 330 335

Trp Met His Cys Asp Gly Met Ile Gly Phe Gly Asp Met Thr Gly Gly

340 345 350

Asn Ala Gly Gly Gly Gly Ser Ser Gly Gly Ser Gly Ala Asn Thr Leu
355 360 365

Val Yll Arg Ala Val Glv ser Gl y Asn Glu Gin Leu Arg val Asn 

. 370 375 380 

Val Gly Gly Asn Thr He Gin Thr Leu Asn Leu Ser ser Ser Trp Gin 
385 . , 390 395 400 

Asp Phe Thr val Asn Thr Asp Ala Ser Gly Asp He Asn Val Glu Leu 
. 405 410 415 

Phe Asn Asp Gin Gly Gin Gly Tyr Glu Ala Arg He Asp Tyr Val Leu 

420 425 430 

val Asn Gly Glu Thr Arg Tyr Ala Ala Asp Gin ser Tyr Asn Thr ser 

435 440 445 

Ala Trp Asp Gly Glu cys Gly Gly Gly Ser Phe Thr Gin Trp Met His 

450 455 460 

Cys Asp Gly Met lie Gly phe Gly Asp Met ser Gly Gly Gly ser Ala 
465 470 475 480 

Val Gly Thr ser Ser Ser Gly Asn Ala Gly ser Asn Thr Ser ser Ala 

485 490 495 

cys Tyr cys Asn Trp Tyr Gly ser val Met Ala ser Cys Glu Asn Gin 

, 500 505 510 

val Asn Gly Trp Gly Trp Glu Asn Asn Gin Ser Cys lie Gly Asn Asn 

515 520 525 

Thr cys Asn Asn Gin Gly Gly ser Gly Gly val val Cys Asn 
530 535 540 

<210> 263 

<211> 1092 

<212> DNA 

<213> Unknown 

<220> 

<223> obtained from an environmental sample. 
<400> 263 

atgaaaacta atcacccatt taaattcggg aaaaaaatat gtatggcatt ggctttgctg 60 

gtgcttggca tacaggcttc aatcgcacag gaaatttgta ttaccagcgg cactgaccag 120 

ccacatccaa cggctatacc cacgaactat ggaatcagga cacccggggg 180 

acg H C £2 ta tsactattaa tgcaggcacc acttacagtg cgcggtggaa cggtgcattt 240 

aactatttgg cccgccgtgg attggcctac gatggttcgt ccctcaccca tgctgaccgg 300 

gggaaattca ccataaatta tgcctctaac tacaactgca acaatatgaa tgggctctct 360 

tatttaagcg tgtacggatg gacgcgggat tttgccaagg aaaatgccaa tccggcagga 420 

tcacaggctc atcaggaagc gctggtggaa tattacattg ttgaaaactg gtgcgactgg 480 

aagaccctaa cgcccagagt ctgggcaccc tgaatgttga tgggtcgatc 540 

^ a 3i a J a l55 atcgcacaga acggatcaac caaccttcta tcaggtgcgg tggtacctgc 600 

accaatactt cagcattcgc cgcaacacac gtaacagtgg caccattglt 660 

2JSS«SS tc atttcaac ca gtgggaagca ttaaccggcg tccctatggg tggcctgcac 720 

gaagtgatga tgaaggtcga aggctacaac tcaaacaatc aatccagtgg caatgtaagc 780 

SESS** t 2 ctcat 9 c 9 tgcccgcttc gaggatggcg ccattgtcga gaaccagaat 840 

gcggtcggcc atgcgcacgg tggagaagcg gtgggagatg atcaccgccg tcttgccctg 900 

ggccaggccc ttgaagcggg cgaacacctc ggcctcggcc ttggcgtcga gggcggcggt 960 

fSSESSP* a 9 aat 9 atca actcggcgtc gcgcatatag gcgcgggcga tggctacctt 1020 

ctgccactcg ccgcccgaga ggtcgcggcc ctgcttgaaa aggcgcccca attgctccag 1080 

agaaacgggt ga 1092 

<210> 264 
<211> 363 
<212> PRT 
<213> Unknown
<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(29) 

<400> 264 

Met Lys Thr Asn His Pro Phe Lys Phe Gly Lys Lys Ile Cys Met Ala
1 5 10 15

Leu Ala Leu Leu Val Leu Gly Ile Gln Ala Ser Ile Ala Gln Glu Ile
20 25 30

Cys Ile Thr Ser Gly Thr Asp Gln Ile Arg Glu Thr Thr Ser Asn Gly
35 40 45

Tyr Thr His Glu Leu Trp Asn Gln Asp Thr Arg Gly Thr Ala Cys Met
50 55 60

Thr Ile Asn Ala Gly Thr Thr Tyr Ser Ala Arg Trp Asn Gly Ala Phe
65 70 75 80

Asn Tyr Leu Ala Arg Arg Gly Leu Ala Tyr Asp Gly Ser Ser Leu Thr
85 90 95

His Ala Asp Arg Gly Lys Phe Thr Ile Asn Tyr Ala Ser Asn Tyr Asn
100 105 110

Cys Asn Asn Met Asn Gly Leu Ser Tyr Leu Ser Val Tyr Gly Trp Thr
115 120 125

Arg Asp Phe Ala Lys Glu Asn Ala Asn Pro Ala Gly Ser Gln Ala His
130 135 140

Gln Glu Ala Leu Val Glu Tyr Tyr Ile Val Glu Asn Trp Cys Asp Trp
145 150 155 160

Asn Val Ser Gln Asp Pro Asn Ala Gln Ser Leu Gly Thr Leu Asn Val
165 170 175

Asp Gly Ser Ile Tyr Asp Met Tyr Arg Thr Glu Arg Ile Asn Gln Pro
180 185 190

Ser Ile Arg Cys Gly Gly Thr Cys Asp Asn Phe Tyr Gln Tyr Phe Ser
195 200 205

Ile Arg Arg Asn Thr Arg Asn Ser Gly Thr Ile Asp Val Ser Ala His
210 215 220

Phe Asn Gln Trp Glu Ala Leu Thr Gly Val Pro Met Gly Gly Leu His
225 230 235 240

Glu Val Met Met Lys Val Glu Gly Tyr Asn Ser Asn Asn Gln Ser Ser
245 250 255

Gly Asn Val Ser Phe Thr Gln Leu Leu Met Arg Ala Arg Phe Glu Asp
260 265 270

Gly Ala Ile Val Glu Asn Gln Asn Ala Val Gly His Ala His Gly Gly
275 280 285

Glu Ala Val Gly Asp Asp His Arg Arg Leu Ala Leu Gly Gln Ala Leu
290 295 300

Glu Ala Gly Glu His Leu Gly Leu Gly Leu Gly Val Glu Gly Gly Gly
305 310 315 320

Gly Phe Val Glu Asn Asp Gln Leu Gly Val Ala His Ile Gly Ala Gly
325 330 335

Asp Gly Tyr Leu Leu Pro Leu Ala Ala Arg Glu Val Ala Ala Leu Leu
340 345 350

Glu Lys Ala Pro Gln Leu Leu Gln Arg Asn Gly
355 360

<210> 265 

<211> 996 

<212> DNA 

<213> unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 265 

atgaacagct ccctcccctc cctccgcgat gtattcgcga atgatttccg catcggggcg 60 
gcggtcaatc ctgtgacgat cgagatgcaa aaacagttgt tgatcgatca tgtcaacagt 120 
attacggcag agaaccatat gaagtttgag catcttcagc cggaagaagg gaaatttacc 180 
tttcaggaag cggatcggat tgtggatttt gcttgttcgc accgaatggc ggttcgaggg 240 
cacacacttg tatggcacaa ccagactccg gattgggtgt ttcaagatgg tcaaggccat 300

ttcgtcagtc gggatgtgtt gcttgagcgg atgaaatgtc acatttcaac tgttgtacgg 360 

cgatacaagg gaaaaatata ttgttgggat gtcatcaacg aagcggtagc cgacgaagga 420 

gacgaattgt tgaggccgtc gaagtggcga caaatcatcg gggacgattt tatggaacaa 480 

gcatttctct acgcttatga agctgaccca gatgcactgc ttttttacaa tgactataat 540 

gaatgttttc cggaaaagag agaaaaaatt tttgcacttg tcaaatcgct gcgtgataaa 600 

ggcattccga ttcatggcat cggcatgcag gcgcactgga gcctgacccg cccgtcgctt 660 

gatgaaattc gtgcggcgat tgaacggtat gcgtcccttg gtgttgttct tcatattacg 720 

gaactcgatg tatccatgtt tgaatttcac gatcgtcgaa ccgatttggc tgtcccgacg 780 

aacgaaatga tcgaacagca agcagaacgg tatgggcaaa tttttgcttt gtttaaggag 840 

tatcgcgatg ttattcaaag tgtcacattt tggggaattg ctgatgacca tacatggctc 900 

gataactttc cagtgcacgg gagaaaaaac tggccgcttt tgttcgatga acagcataaa 960 

ccgaaaccag ctttttggcg ggcagtgagt gtctga 996

<210> 266 
<211> 331 
<212> PRT 
<213> Unknown 

<220> 

<223> obtained from an environmental sample. 
<400> 266 

Met Asn Ser Ser Leu Pro Ser Leu Arg Asp Val Phe Ala Asn Asp Phe
1 5 10 15

Arg Ile Gly Ala Ala Val Asn Pro Val Thr Ile Glu Met Gln Lys Gln
20 25 30

Leu Leu Ile Asp His Val Asn Ser Ile Thr Ala Glu Asn His Met Lys
35 40 45

Phe Glu His Leu Gln Pro Glu Glu Gly Lys Phe Thr Phe Gln Glu Ala
50 55 60

Asp Arg Ile Val Asp Phe Ala Cys Ser His Arg Met Ala Val Arg Gly
65 70 75 80

His Thr Leu Val Trp His Asn Gln Thr Pro Asp Trp Val Phe Gln Asp
85 90 95

Gly Gln Gly His Phe Val Ser Arg Asp Val Leu Leu Glu Arg Met Lys
100 105 110

Cys His Ile Ser Thr Val Val Arg Arg Tyr Lys Gly Lys Ile Tyr Cys
115 120 125

Trp Asp Val Ile Asn Glu Ala Val Ala Asp Glu Gly Asp Glu Leu Leu
130 135 140

Arg Pro Ser Lys Trp Arg Gln Ile Ile Gly Asp Asp Phe Met Glu Gln
145 150 155 160

Ala Phe Leu Tyr Ala Tyr Glu Ala Asp Pro Asp Ala Leu Leu Phe Tyr
165 170 175

Asn Asp Tyr Asn Glu Cys Phe Pro Glu Lys Arg Glu Lys Ile Phe Ala
180 185 190

Leu Val Lys Ser Leu Arg Asp Lys Gly Ile Pro Ile His Gly Ile Gly
195 200 205

Met Gln Ala His Trp Ser Leu Thr Arg Pro Ser Leu Asp Glu Ile Arg
210 215 220

Ala Ala Ile Glu Arg Tyr Ala Ser Leu Gly Val Val Leu His Ile Thr
225 230 235 240

Glu Leu Asp Val Ser Met Phe Glu Phe His Asp Arg Arg Thr Asp Leu
245 250 255

Ala Val Pro Thr Asn Glu Met Ile Glu Gln Gln Ala Glu Arg Tyr Gly
260 265 270

Gln Ile Phe Ala Leu Phe Lys Glu Tyr Arg Asp Val Ile Gln Ser Val
275 280 285

Thr Phe Trp Gly Ile Ala Asp Asp His Thr Trp Leu Asp Asn Phe Pro
290 295 300

Val His Gly Arg Lys Asn Trp Pro Leu Leu Phe Asp Glu Gln His Lys
305 310 315 320

Pro Lys Pro Ala Phe Trp Arg Ala Val Ser Val
325 330

<210> 267 
<211> 1956 
<212> DNA 
<213> Bacteria



<400> 267 

atgaagcgta aggttaagaa gatggcagct atggcaacga gtataattat ggctatcatg 60

atcatcctac atagtatacc agtactcgcc gggcgaataa tttacgacaa tgagacaggc 120

acacatggag gctacgacta tgagctctgg aaagactacg gaaatacgat tatggaactt 180

aacgacggtg gtacttttag ttgtcaatgg agtaatatcg gtaatgcact atttagaaaa 240

gggagaaaat ttaattccga caaaacctat caagaattag gagatatagt agttgaatat 300

ggctgtgatt acaatccaaa cggaaattcc tatttgtgtg tttacggttg gacaagaaat 360

ccactggttg aatattacat tgtagaaagc tggggcagct ggcgtccacc tggagcaaca 420

cccaaaggaa ccatcacagt ggatggcggt acttatgaaa tatatgaaac tacccgggta 480

aatcagcctt ccatcgatgg aactgcgaca ttccaacaat attggagtgt tcgtacatcc 540

aagagaacaa gcggaacaat atctgtcact gaacatttta aacagtggga aagaatgggc 600

atgcgaatgg gtaagatgta tgaagttgct cttaccgttg aaggttatca gagcagtggg 660

tacgctaatg tatataagaa tgaaatcaga ataggtgcaa atccaactcc tgccccatct 720

caaagcccaa ttagaagaga tgcattttca ataatcgaag cggaagaata taacagcaca 780

aattcctcca ctttacaagt gattggaacg ccaaataatg gcagaggaat tggttatatt 840

gaaaatggta ataccgtaac ttacagcaat atagattttg gtagtggtgc aacagggttc 900

tctgcaactg ttgcaacgga ggttaatacc tcaattcaaa tccgttctga cagtcctatc 960

ggaactctac ttggtacctt atatgtaagt tctaccggca gctggaatac atatcaaacc 1020

gtatctacaa acatcagcaa aattaccggc gttcatgata ttgtattggt attctcaggt 1080

ccagtcaatg tggacaactt catatttagc agaagttcac cagtgcctgc acctggtgat 1140

aacacaagag acgcatattc tatcattcag gccgaggatt atgacagcag ttatggcccc 1200

aaccttcaaa tctttagctt accaggcggt ggcagcgcca ttggctatat tgaaaatggt 1260

tattccacta cctataataa cgttaatttc gccaacggct taagttctat aacagcaaga 1320

gttgccactc agatctcaac ttccattcag gtgagagcag gaggagcaac cggtacttta 1380

cttggtacaa tatatgttcc ttcgacaaat agttgggatt cttatcagaa tgtaactgcc 1440

aaccttagca atattacagg tgtgcatgat attacccttg tcttttcagg accagtgaat 1500

gtggactact tcgtatttac accagcaaat gtaaattcag ggcctacctc ccctgtcgga 1560

ggtacaagaa gtgcattttc caatattcaa gccgaagatt atgacagcag ttatggtccc 1620

aaccttcaaa tctttagctt accaggtggt ggcagcgcca ttggctatat tgaaaatggt 1680

tattccacta cctataaaaa cattgatttt ggtgacggcg caacgtccgt aacagcaaga 1740

gtagctaccc agaatgctac taccattcag gtaagattgg gaagtccatc gggtacatta 1800

cttggaacaa tttacgtggg gtccacagga agctttgata cttataggga tgtatccgct 1860

accattagta atactgcggg tgtaaaagat attgttcttg tattttcagg tcctgttaat 1920

gttgactggt ttgtattctc aaaatcagga acttaa 1956

<210> 268
<211> 651
<212> PRT
<213> Bacteria

<220>

<221> SIGNAL
<222> (1)...(30)

<400> 268

Met Lys Arg Lys Val Lys Lys Met Ala Ala Met Ala Thr Ser Ile Ile
1 5 10 15

Met Ala Ile Met Ile Ile Leu His Ser Ile Pro Val Leu Ala Gly Arg
20 25 30

Ile Ile Tyr Asp Asn Glu Thr Gly Thr His Gly Gly Tyr Asp Tyr Glu
35 40 45

Leu Trp Lys Asp Tyr Gly Asn Thr Ile Met Glu Leu Asn Asp Gly Gly
50 55 60

Thr Phe Ser Cys Gln Trp Ser Asn Ile Gly Asn Ala Leu Phe Arg Lys
65 70 75 80

Gly Arg Lys Phe Asn Ser Asp Lys Thr Tyr Gln Glu Leu Gly Asp Ile
85 90 95

Val Val Glu Tyr Gly Cys Asp Tyr Asn Pro Asn Gly Asn Ser Tyr Leu
100 105 110

Cys Val Tyr Gly Trp Thr Arg Asn Pro Leu Val Glu Tyr Tyr Ile Val
115 120 125

Glu Ser Trp Gly Ser Trp Arg Pro Pro Gly Ala Thr Pro Lys Gly Thr
130 135 140

Ile Thr Val Asp Gly Gly Thr Tyr Glu Ile Tyr Glu Thr Thr Arg Val
145 150 155 160

Asn Gln Pro Ser Ile Asp Gly Thr Ala Thr Phe Gln Gln Tyr Trp Ser
165 170 175

Val Arg Thr Ser Lys Arg Thr Ser Gly Thr Ile Ser Val Thr Glu His
180 185 190

Phe Lys Gln Trp Glu Arg Met Gly Met Arg Met Gly Lys Met Tyr Glu
195 200 205

Val Ala Leu Thr Val Glu Gly Tyr Gln Ser Ser Gly Tyr Ala Asn Val
210 215 220

Tyr Lys Asn Glu Ile Arg Ile Gly Ala Asn Pro Thr Pro Ala Pro Ser
225 230 235 240

Gln Ser Pro Ile Arg Arg Asp Ala Phe Ser Ile Ile Glu Ala Glu Glu
245 250 255

Tyr Asn Ser Thr Asn Ser Ser Thr Leu Gln Val Ile Gly Thr Pro Asn
260 265 270

Asn Gly Arg Gly Ile Gly Tyr Ile Glu Asn Gly Asn Thr Val Thr Tyr
275 280 285

Ser Asn Ile Asp Phe Gly Ser Gly Ala Thr Gly Phe Ser Ala Thr Val
290 295 300

Ala Thr Glu Val Asn Thr Ser Ile Gln Ile Arg Ser Asp Ser Pro Ile
305 310 315 320

Gly Thr Leu Leu Gly Thr Leu Tyr Val Ser Ser Thr Gly Ser Trp Asn
325 330 335

Thr Tyr Gln Thr Val Ser Thr Asn Ile Ser Lys Ile Thr Gly Val His
340 345 350

Asp Ile Val Leu Val Phe Ser Gly Pro Val Asn Val Asp Asn Phe Ile
355 360 365

Phe Ser Arg Ser Ser Pro Val Pro Ala Pro Gly Asp Asn Thr Arg Asp
370 375 380

Ala Tyr Ser Ile Ile Gln Ala Glu Asp Tyr Asp Ser Ser Tyr Gly Pro
385 390 395 400

Asn Leu Gln Ile Phe Ser Leu Pro Gly Gly Gly Ser Ala Ile Gly Tyr
405 410 415

Ile Glu Asn Gly Tyr Ser Thr Thr Tyr Asn Asn Val Asn Phe Ala Asn
420 425 430

Gly Leu Ser Ser Ile Thr Ala Arg Val Ala Thr Gln Ile Ser Thr Ser
435 440 445

Ile Gln Val Arg Ala Gly Gly Ala Thr Gly Thr Leu Leu Gly Thr Ile
450 455 460

Tyr Val Pro Ser Thr Asn Ser Trp Asp Ser Tyr Gln Asn Val Thr Ala
465 470 475 480

Asn Leu Ser Asn Ile Thr Gly Val His Asp Ile Thr Leu Val Phe Ser
485 490 495

Gly Pro Val Asn Val Asp Tyr Phe Val Phe Thr Pro Ala Asn Val Asn
500 505 510

Ser Gly Pro Thr Ser Pro Val Gly Gly Thr Arg Ser Ala Phe Ser Asn
515 520 525

Ile Gln Ala Glu Asp Tyr Asp Ser Ser Tyr Gly Pro Asn Leu Gln Ile
530 535 540

Phe Ser Leu Pro Gly Gly Gly Ser Ala Ile Gly Tyr Ile Glu Asn Gly
545 550 555 560

Tyr Ser Thr Thr Tyr Lys Asn Ile Asp Phe Gly Asp Gly Ala Thr Ser
565 570 575

Val Thr Ala Arg Val Ala Thr Gln Asn Ala Thr Thr Ile Gln Val Arg
580 585 590

Leu Gly Ser Pro Ser Gly Thr Leu Leu Gly Thr Ile Tyr Val Gly Ser
595 600 605

Thr Gly Ser Phe Asp Thr Tyr Arg Asp Val Ser Ala Thr Ile Ser Asn
610 615 620

Thr Ala Gly Val Lys Asp Ile Val Leu Val Phe Ser Gly Pro Val Asn
625 630 635 640

Val Asp Trp Phe Val Phe Ser Lys Ser Gly Thr
645 650

<210> 269 
<211> 1110 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample. 
<400> 269

atggggtaca ataggatcat acaagcgatc cgcgtaagca agggagatgt tttgggcgtt 60 

cataaagttt tttacgctgc acttgcgtgt gtggcgatgg ggtattcgga aacgtgggca 120 

cagtgcgcga cctggacccg aagcaccatt cgcaattgcg agggcatcga ctacgagttg 180 

tggaaccaga acaaccgcgg cacggtcaac atggaaatca cgggaaacgg aacgttcgcg 240 

gcgacgtgga gcggaacgga aaacatcctg tttcgcgccg gcaagaaatg ggggttcaac 300 

agcaccacga cggcgcggtc ggtcggcgcc atcacgctcg atttcgctgc gacctggacc 360 

tccagcgaca acgtgaaaat gctcggcatc tacggctggg cgtattaccc gtcgggaagc 420 

gagccgacga aaacggaaag cggtcaaaac acgagctttt ccgatcagat cgagtattac 480 

atcatccagg accgcggagg cttcaacccg ggttccggcg gcgtcaacgc caaaaagtac 540 

ggcgaggccg tgatcgacgg aatcgcctat gacttttggg tggccgaccg gatcaaccag 600 

cccatgctga caggaagagg caacttcaag caatacttca gcgttccacg gaacacgagc 660 

agccaccggc aaagcggcat cgtcagcatt tcgaagcact ttgaggagtg ggacaaggcc 720 

ggcatgaaga tgctggactg tccgctatac gaagtcgcga tgaaggtgga atcgtatacg 780 

ggctcggcga atggcggcgg gtcggcgaac gtgacccgga atattctcac gctcggcggt 840 

tcttccgcac cgacccctat cgcgcgcggc cccggccggt ccgccgaaag catgcgggtc 900 

gccttcgttc aggaaagagt gctcaaggtc gcgcccgtcg acggaacccg cctgcaagtc 960 

aaggtgcggg acgtgaaggg cgtgaaccgg gccgagttca atgccgcggg cgcggcaacg 1020 

ttctcgttgt cccatgtccc cgcgggcccg tatttcctgg atgtgacggg gccggatgta 1080 

agacagatca cgccgttcgt tttgcgataa 1110

<210> 270 

<211> 369 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 270 

Met Gly Tyr Asn Arg Ile Ile Gln Ala Ile Arg Val Ser Lys Gly Asp
1 5 10 15

Val Leu Gly Val His Lys Val Phe Tyr Ala Ala Leu Ala Cys Val Ala
20 25 30

Met Gly Tyr Ser Glu Thr Trp Ala Gln Cys Ala Thr Trp Thr Arg Ser
35 40 45

Thr Ile Arg Asn Cys Glu Gly Ile Asp Tyr Glu Leu Trp Asn Gln Asn
50 55 60

Asn Arg Gly Thr Val Asn Met Glu Ile Thr Gly Asn Gly Thr Phe Ala
65 70 75 80

Ala Thr Trp Ser Gly Thr Glu Asn Ile Leu Phe Arg Ala Gly Lys Lys
85 90 95

Trp Gly Phe Asn Ser Thr Thr Thr Ala Arg Ser Val Gly Ala Ile Thr
100 105 110

Leu Asp Phe Ala Ala Thr Trp Thr Ser Ser Asp Asn Val Lys Met Leu
115 120 125

Gly Ile Tyr Gly Trp Ala Tyr Tyr Pro Ser Gly Ser Glu Pro Thr Lys
130 135 140

Thr Glu Ser Gly Gln Asn Thr Ser Phe Ser Asp Gln Ile Glu Tyr Tyr
145 150 155 160

Ile Ile Gln Asp Arg Gly Gly Phe Asn Pro Gly Ser Gly Gly Val Asn
165 170 175

Ala Lys Lys Tyr Gly Glu Ala Val Ile Asp Gly Ile Ala Tyr Asp Phe
180 185 190

Trp Val Ala Asp Arg Ile Asn Gln Pro Met Leu Thr Gly Arg Gly Asn
195 200 205

Phe Lys Gln Tyr Phe Ser Val Pro Arg Asn Thr Ser Ser His Arg Gln
210 215 220

Ser Gly Ile Val Ser Ile Ser Lys His Phe Glu Glu Trp Asp Lys Ala
225 230 235 240

Gly Met Lys Met Leu Asp Cys Pro Leu Tyr Glu Val Ala Met Lys Val
245 250 255

Glu Ser Tyr Thr Gly Ser Ala Asn Gly Gly Gly Ser Ala Asn Val Thr
260 265 270

Arg Asn Ile Leu Thr Leu Gly Gly Ser Ser Ala Pro Thr Pro Ile Ala
275 280 285

Arg Gly Pro Gly Arg Ser Ala Glu Ser Met Arg Val Ala Phe Val Gln
290 295 300

Glu Arg Val Leu Lys Val Ala Pro Val Asp Gly Thr Arg Leu Gln Val
305 310 315 320

Lys Val Arg Asp Val Lys Gly Val Asn Arg Ala Glu Phe Asn Ala Ala
325 330 335

Gly Ala Ala Thr Phe Ser Leu Ser His Val Pro Ala Gly Pro Tyr Phe
340 345 350

Leu Asp Val Thr Gly Pro Asp Val Arg Gln Ile Thr Pro Phe Val Leu
355 360 365

Arg

<210> 271 
<211> 1128 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 271 

atgttcattc acaacagcat atgcagcgca ctctgcacaa tctttttggc aactgcaaca 60 

atgggagaaa acatgacact acaagaagcc tttgccgatc acttttatgt gggagccgcc 120 

atcagccaac gcctttttca accagatcgc gccgaaacgc tgcaactggc cgcgcaccaa 180 

ttcaacagca tcacagccga aaatgagatg aagtggcagt cgttaaatcc cactcctggc 240 

gaataccgtt tcgaaaacgc cgataaattc gtccgctttg gtgtcgaaaa cgatatgtac 300 

atcgttgggc acgttctctt ctggcacagc cagacacccg actggctctt caaggatgac 360 

gacggtaact tcgtctcccg cgaagtctta ctcgaccgca tgcgcgccca cgtgcgcaat 420 

cttgtccagc gctacggcaa ccatgtgcac gcctgggatg ttatcaatga aaccttcaat 480 

gataatggtt ccttgcgcga cagcccatgg acgcaaatcc tcggcgagga attcatcgag 540 

cacgccttcc ggattgccgg cgaggaactc cccccccatg tcgagctgct ctacaatgat 600 

tattcgatga ccattcctgc caagcgcgat gctgttgctg aaatggttcg cgacctcata 660 

gccaaaggca tccgcattga cggcgttggc atgcagggac attgggcacg gacccacccg 720 

accatagcgg acatagaaaa aagcattctt gccttcgcag gaaccggcgt acaggtacac 780 

atcactgagc tcgacatcga catgctgcca cgccatcccc agatgtttac tggtggtgca 840 

gacaccatgt tgcgcctaca acaagatccc aaactcgacc cctacactga gggacttcca 900 

gcggaagatc agcaggcatt ggcagaacgc tacgcaagca tcttccgttt attcttgaag 960 

cacagcgatg ttattcgccg tgtcaccttc tggggggtca ccgatgccca cacctggctc 1020 

aacaattggc ccatccgtgg ccgcaccagc catcccctgc tcttcgaccg ccagaacaac 1080 

cccaaacccg ccttccacgc cgtcgtcaga ctgaagaccg aagactga 1128 

<210> 272 
<211> 375 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(22) 

<400> 272 

Met Phe Ile His Asn Ser Ile Cys Ser Ala Leu Cys Thr Ile Phe Leu
1 5 10 15

Ala Thr Ala Thr Met Gly Glu Asn Met Thr Leu Gln Glu Ala Phe Ala
20 25 30

Asp His Phe Tyr Val Gly Ala Ala Ile Ser Gln Arg Leu Phe Gln Pro
35 40 45

Asp Arg Ala Glu Thr Leu Gln Leu Ala Ala His Gln Phe Asn Ser Ile
50 55 60

Thr Ala Glu Asn Glu Met Lys Trp Gln Ser Leu Asn Pro Thr Pro Gly
65 70 75 80

Glu Tyr Arg Phe Glu Asn Ala Asp Lys Phe Val Arg Phe Gly Val Glu
85 90 95

Asn Asp Met Tyr Ile Val Gly His Val Leu Phe Trp His Ser Gln Thr
100 105 110

Pro Asp Trp Leu Phe Lys Asp Asp Asp Gly Asn Phe Val Ser Arg Glu
115 120 125

Val Leu Leu Asp Arg Met Arg Ala His Val Arg Asn Leu Val Gln Arg
130 135 140

Tyr Gly Asn His Val His Ala Trp Asp Val Ile Asn Glu Thr Phe Asn
145 150 155 160

Asp Asn Gly Ser Leu Arg Asp Ser Pro Trp Thr Gln Ile Leu Gly Glu
165 170 175

Glu Phe Ile Glu His Ala Phe Arg Ile Ala Gly Glu Glu Leu Pro Pro
180 185 190

His Val Glu Leu Leu Tyr Asn Asp Tyr Ser Met Thr Ile Pro Ala Lys
195 200 205

Arg Asp Ala Val Ala Glu Met Val Arg Asp Leu Ile Ala Lys Gly Ile
210 215 220

Arg Ile Asp Gly Val Gly Met Gln Gly His Trp Ala Arg Thr His Pro
225 230 235 240

Thr Ile Ala Asp Ile Glu Lys Ser Ile Leu Ala Phe Ala Gly Thr Gly
245 250 255

Val Gln Val His Ile Thr Glu Leu Asp Ile Asp Met Leu Pro Arg His
260 265 270

Pro Gln Met Phe Thr Gly Gly Ala Asp Thr Met Leu Arg Leu Gln Gln
275 280 285

Asp Pro Lys Leu Asp Pro Tyr Thr Glu Gly Leu Pro Ala Glu Asp Gln
290 295 300

Gln Ala Leu Ala Glu Arg Tyr Ala Ser Ile Phe Arg Leu Phe Leu Lys
305 310 315 320

His Ser Asp Val Ile Arg Arg Val Thr Phe Trp Gly Val Thr Asp Ala
325 330 335

His Thr Trp Leu Asn Asn Trp Pro Ile Arg Gly Arg Thr Ser His Pro
340 345 350

Leu Leu Phe Asp Arg Gln Asn Asn Pro Lys Pro Ala Phe His Ala Val
355 360 365

Val Arg Leu Lys Thr Glu Asp
370 375

<210> 273 
<211> 1134 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 273 

atggtttcat cgctaatcaa ttcttcatac attcggctca agcactattc gtgctcaagt 60 

ttattgctcc tgacattggc agcctgtggc ggccagcagc ctcccccgga tacgggatcc 120 

agcacttcaa gttcaagcag ttcttcgagc tccagttcaa gcagctcttc aagttccagc 180 

tcaagcagtt cttccagctc cagctcgagc agctcttcga gttcgagctc ttcatcatcc 240 

agctcttcag gggcaaaccc gccaccgacc gggggcaagt tcgtcggcaa catcacgacc 300 

cgaggcgccg tccaagcgga cttcattcag tactgggatc aaattacgcc ggagaacgag 360 

ggcaaatggg gttctgtgga aggaactcgc gaccagtaca actgggcgcc tcttgatcgc 420 

atctatgact atgcacgtca gcacaatatc ccagtcaaag cgcatacgct ggtttggggt 480 

gcacaggctc caggctggat caacaatctg agtgcggccg agcagcgtga ggaaatcgag 540 

gaatggattc gtgattactg cacgcgttac ccagacaccc aaatgatcga cgtagttaac 600 

gaggcgcacc caaatcacgc ccccgctcgc tatgcgcaga atgccttcgg caatgactgg 660 

attaccgaag cgttcaaact ggcgcgccgg cactgcccca acgccatttt gatctacaac 720 

gactataatt tcatcacttg ggataccgat gaaatcatgg cgctgattcg cccggctatc 780 

gcagcagggg tagtggatgc ggtagggctg caggcgcata gcttgtatcc tgacgaatac 840 

gctaacaaga tgtggagtgc cgctgaaata cagcagaagc tcgatctgat ctctaccctt 900 

ggcgtgccga tgtatatttc ggaatatgat gtcgccaagt ccaatgacca agagcagttg 960 

gcgattttca gcgagcagtt cccggtcctt tacgaacacc ccaatgtcgt aggtgtaacc 1020 

ctctggggct atattgatgg agcgacgtgg cgcgccggct cgggcttgat tcgaaacggt 1080 

cagcaccggc ccgccatgca atggctgctc gagtacttgg agaacaatcg atag 1134 

<210> 274 
<211> 377 
<212> PRT 
<213> Unknown 

<220> 

<223> obtained from an environmental sample. 
<221> SIGNAL 
<222> (1)...(74)
<400> 274 

Met Val Ser Ser Leu Ile Asn Ser Ser Tyr Ile Arg Leu Lys His Tyr
1 5 10 15

Ser Cys Ser Ser Leu Leu Leu Leu Thr Leu Ala Ala Cys Gly Gly Gln
20 25 30

Gln Pro Pro Pro Asp Thr Gly Ser Ser Thr Ser Ser Ser Ser Ser Ser
35 40 45

Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser
50 55 60

Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser
65 70 75 80

Ser Ser Ser Gly Ala Asn Pro Pro Pro Thr Gly Gly Lys Phe Val Gly
85 90 95

Asn Ile Thr Thr Arg Gly Ala Val Gln Ala Asp Phe Ile Gln Tyr Trp
100 105 110

Asp Gln Ile Thr Pro Glu Asn Glu Gly Lys Trp Gly Ser Val Glu Gly
115 120 125

Thr Arg Asp Gln Tyr Asn Trp Ala Pro Leu Asp Arg Ile Tyr Asp Tyr
130 135 140

Ala Arg Gln His Asn Ile Pro Val Lys Ala His Thr Leu Val Trp Gly
145 150 155 160

Ala Gln Ala Pro Gly Trp Ile Asn Asn Leu Ser Ala Ala Glu Gln Arg
165 170 175

Glu Glu Ile Glu Glu Trp Ile Arg Asp Tyr Cys Thr Arg Tyr Pro Asp
180 185 190

Thr Gln Met Ile Asp Val Val Asn Glu Ala His Pro Asn His Ala Pro
195 200 205

Ala Arg Tyr Ala Gln Asn Ala Phe Gly Asn Asp Trp Ile Thr Glu Ala
210 215 220

Phe Lys Leu Ala Arg Arg His Cys Pro Asn Ala Ile Leu Ile Tyr Asn
225 230 235 240

Asp Tyr Asn Phe Ile Thr Trp Asp Thr Asp Glu Ile Met Ala Leu Ile
245 250 255

Arg Pro Ala Ile Ala Ala Gly Val Val Asp Ala Val Gly Leu Gln Ala
260 265 270

His Ser Leu Tyr Pro Asp Glu Tyr Ala Asn Lys Met Trp Ser Ala Ala
275 280 285

Glu Ile Gln Gln Lys Leu Asp Leu Ile Ser Thr Leu Gly Val Pro Met
290 295 300

Tyr Ile Ser Glu Tyr Asp Val Ala Lys Ser Asn Asp Gln Glu Gln Leu
305 310 315 320

Ala Ile Phe Ser Glu Gln Phe Pro Val Leu Tyr Glu His Pro Asn Val
325 330 335

Val Gly Val Thr Leu Trp Gly Tyr Ile Asp Gly Ala Thr Trp Arg Ala
340 345 350

Gly Ser Gly Leu Ile Arg Asn Gly Gln His Arg Pro Ala Met Gln Trp
355 360 365

Leu Leu Glu Tyr Leu Glu Asn Asn Arg
370 375

<210> 275 
<211> 1401 
<212> DNA 
<213> Unknown 

<220> 

<223> obtained from an environmental sample. 
<400> 275 

ttgggcgctg atccatttgc gctcacctat aacggaagag tgtacattta tatgtcgagt 60 

gatgactatg aatatcacag caatggaacg attaaggata attcttttgc caatttgaat 120 

agggtctttg tcatctcttc agcagatatg gtgaactgga cagatcatgg cgcgattcca 180 

gtagctgggg caaatggcgc aaatggcggc aaaggaattg ccaaatgggc aggtgcttcc 240 

tgggctccat cagcagcggt gaaaaaaatc aatgggaagg ataaattttt cctttatttc 300 

gcgaacagcg gcggagggat tggcgttctg acagcagact cccccatcgg tccatggaca 360 

gatcctatcg gaaaagcact cgtcacgcca aatacaccag ggatggctgg agttgtatgg 420 

ctttttgatc ctgccgtttt tgtagatgat gacggcactg gttatctata tgccggcgga 480 
ggtgttccag gcggttctaa tccaacgcag ggacaatggg cgaatcctaa aacagcaaga 540

gttctaaaac taggacctga tatgacaagt gtggtaggca gcgcatcaac cattgatgct 600 

ccttttatgt ttgaagattc ggggatgcat aagtataacg gaacctatta ctattcctat 660 

tgcatcaact ttggcggctc ccacccagca gataaaccac ctggtgagat cggttatatg 720 

acgagctcaa gtccgatggg tccctttacg tatagagggc acttcctgaa aaatccgggt 780 

gcatttttcg ggggaggcgg taataaccat catgctgtgt tcaattttaa aaacgagtgg 840 

tatgtcgtgt atcataccca aacggtcagc tctgctttat acggatcagg aaaaggctac 900 

agatctccgc atattaataa acttgtgcat aatgctgacg gctcccttcg agaagtcgca 960 

gccaattttg aaggggttaa acagctttcc aacctgaatc cttatcagcg tgtagaagct 1020 

gaaacattcg catggaatgg acgcattttg acagaggcat cttcagctcc aggcggaccg 1080 

gtcaataacc agcatgtcac aaacattcaa aacggagatt gggtggctgc cagtaacgtc 1140 

gatttcggat caaacggcgc gaggacattt aaagcgaatg tagcatcaaa tacaggcggg 1200 

aaaatagaag tacgcctcgg aagtccagac ggcagactcg tcggaacact gaatgtccct 1260 

tccacagggg gaacaaataa ctggcgagaa gtagaaacgg cagtaaatgg agcagcaggc 1320 

gtgcacaacg tattttttgt ttttactgga acaggtgcaa atctatttca atttgattcc 1380 

tggcagttta ctcaaaggta a 1401

<210> 276 
<211> 466 
<212> PRT 
<213> unknown 

<220> 

<223> obtained from an environmental sample. 
<400> 276 

Met Gly Ala Asp Pro Phe Ala Leu Thr Tyr Asn Gly Arg Val Tyr Ile
1 5 10 15

Tyr Met Ser Ser Asp Asp Tyr Glu Tyr His Ser Asn Gly Thr Ile Lys
20 25 30

Asp Asn Ser Phe Ala Asn Leu Asn Arg Val Phe Val Ile Ser Ser Ala
35 40 45

Asp Met Val Asn Trp Thr Asp His Gly Ala Ile Pro Val Ala Gly Ala
50 55 60

Asn Gly Ala Asn Gly Gly Lys Gly Ile Ala Lys Trp Ala Gly Ala Ser
65 70 75 80

Trp Ala Pro Ser Ala Ala Val Lys Lys Ile Asn Gly Lys Asp Lys Phe
85 90 95

Phe Leu Tyr Phe Ala Asn Ser Gly Gly Gly Ile Gly Val Leu Thr Ala
100 105 110

Asp Ser Pro Ile Gly Pro Trp Thr Asp Pro Ile Gly Lys Ala Leu Val
115 120 125

Thr Pro Asn Thr Pro Gly Met Ala Gly Val Val Trp Leu Phe Asp Pro
130 135 140

Ala Val Phe Val Asp Asp Asp Gly Thr Gly Tyr Leu Tyr Ala Gly Gly
145 150 155 160

Gly Val Pro Gly Gly Ser Asn Pro Thr Gln Gly Gln Trp Ala Asn Pro
165 170 175

Lys Thr Ala Arg Val Leu Lys Leu Gly Pro Asp Met Thr Ser Val Val
180 185 190

Gly Ser Ala Ser Thr Ile Asp Ala Pro Phe Met Phe Glu Asp Ser Gly
195 200 205

Met His Lys Tyr Asn Gly Thr Tyr Tyr Tyr Ser Tyr Cys Ile Asn Phe
210 215 220

Gly Gly Ser His Pro Ala Asp Lys Pro Pro Gly Glu Ile Gly Tyr Met
225 230 235 240

Thr Ser Ser Ser Pro Met Gly Pro Phe Thr Tyr Arg Gly His Phe Leu
245 250 255

Lys Asn Pro Gly Ala Phe Phe Gly Gly Gly Gly Asn Asn His His Ala
260 265 270

Val Phe Asn Phe Lys Asn Glu Trp Tyr Val Val Tyr His Thr Gln Thr
275 280 285

Val Ser Ser Ala Leu Tyr Gly Ser Gly Lys Gly Tyr Arg Ser Pro His
290 295 300

Ile Asn Lys Leu Val His Asn Ala Asp Gly Ser Leu Arg Glu Val Ala
305 310 315 320

Ala Asn Phe Glu Gly Val Lys Gln Leu Ser Asn Leu Asn Pro Tyr Gln
325 330 335

Arg Val Glu Ala Glu Thr Phe Ala Trp Asn Gly Arg Ile Leu Thr Glu
340 345 350

Ala Ser Ser Ala Pro Gly Gly Pro Val Asn Asn Gln His Val Thr Asn
355 360 365

Ile Gln Asn Gly Asp Trp Val Ala Ala Ser Asn Val Asp Phe Gly Ser
370 375 380

Asn Gly Ala Arg Thr Phe Lys Ala Asn Val Ala Ser Asn Thr Gly Gly
385 390 395 400

Lys Ile Glu Val Arg Leu Gly Ser Pro Asp Gly Arg Leu Val Gly Thr
405 410 415

Leu Asn Val Pro Ser Thr Gly Gly Thr Asn Asn Trp Arg Glu Val Glu
420 425 430

Thr Ala Val Asn Gly Ala Ala Gly Val His Asn Val Phe Phe Val Phe
435 440 445

Thr Gly Thr Gly Ala Asn Leu Phe Gln Phe Asp Ser Trp Gln Phe Thr
450 455 460

Gln Arg
465

<210> 277
<211> 1128
<212> DNA
<213> unknown

<220>

<223> Obtained from an environmental sample.
<400> 277

atgcgaaaca cattaatcct tttgattccg gccttgatga tgctttcttg cagtgcggga 60

aatcaggata gagtaccatc cctgcacgcc gagttctcgg atgcattttt gattggaacg 120

gcgctgaatt ctgagcagat attgggtcgg gatacacgcg gactcgaatt gattagaact 180

cattttaacg ccattacgcc cgaaaacatt accaaatggg aggctatcca tcccgaaccc 240

ggtgtctatg attttaaaga ggctgatgca ttcgtcgatt ttggccaaaa atataatatg 300

ttcatggtgg gtcatacact ggtttggcat agtcagacac cgcgctgggt cttcaaagac 360

gaaaatggcg cgttggtatc gcgcgaggta ctgttagagc ggatgcgcga ccacatccac 420

accgttgttg gccgctaccg tggacgtatt cacggctggg atgtcgtaaa cgaagccctc 480

aatgaagacg gttcgtacag agaaacactg tggtaccaaa taattggtac ggactatatt 540

cttaaagcat tcgaatttgc ccgggaggcc gatcccgacg ctgagctata ctataacgat 600

tactcgcttg agaacccctc aaagagagcc ggcgcgatgc gaattgttca atacctgcag 660

gaacatggtg ctccgattac tggggttgga acccaggggc atttcaccct cgactggccc 720

gaactttctg aaattgaaca gaccgtcatt gattttgcct cccttggtat ggatgtaatg 780

attaccgaat tggatatcga tgtactgcct cagccagacg attatactgg cgccgatgtg 840

aattttagcg cagagcttta cgacgaactg aacccatggc ccaacggcct tccaccggaa 900

attgaacagg aattggccaa tcgatatgcc gacatcttcg aaatctattt gcgtcatcgt 960

gataaagtta cgcgagtgtc tttttggggt gtcacagatg gcgactcgtg gaaaaataac 1020

tggcctgtgc caggtcgtac caactatccg ctcatttttg atcgaaactg gaagccaaaa 1080

cccgcttttt tctcgattgt tgatgcagcc agggaggcac tggattaa 1128


<210> 278 
<211> 375 
<212> PRT 
<213> unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(19)

<400> 278 

Met Arg Asn Thr Leu Ile Leu Leu Ile Pro Ala Leu Met Met Leu Ser
1 5 10 15

Cys Ser Ala Gly Asn Gln Asp Arg Val Pro Ser Leu His Ala Glu Phe
20 25 30

Ser Asp Ala Phe Leu Ile Gly Thr Ala Leu Asn Ser Glu Gln Ile Leu
35 40 45

Gly Arg Asp Thr Arg Gly Leu Glu Leu Ile Arg Thr His Phe Asn Ala
50 55 60

Ile Thr Pro Glu Asn Ile Thr Lys Trp Glu Ala Ile His Pro Glu Pro
65 70 75 80

Gly Val Tyr Asp Phe Lys Glu Ala Asp Ala Phe Val Asp Phe Gly Gln
85 90 95

Lys Tyr Asn Met Phe Met Val Gly His Thr Leu Val Trp His Ser Gln
100 105 110

Thr Pro Arg Trp Val Phe Lys Asp Glu Asn Gly Ala Leu Val Ser Arg
115 120 125

Glu Val Leu Leu Glu Arg Met Arg Asp His Ile His Thr Val Val Gly
130 135 140

Arg Tyr Arg Gly Arg Ile His Gly Trp Asp Val Val Asn Glu Ala Leu
145 150 155 160

Asn Glu Asp Gly Ser Tyr Arg Glu Thr Leu Trp Tyr Gln Ile Ile Gly
165 170 175

Thr Asp Tyr Ile Leu Lys Ala Phe Glu Phe Ala Arg Glu Ala Asp Pro
180 185 190

Asp Ala Glu Leu Tyr Tyr Asn Asp Tyr Ser Leu Glu Asn Pro Ser Lys
195 200 205

Arg Ala Gly Ala Met Arg Ile Val Gln Tyr Leu Gln Glu His Gly Ala
210 215 220

Pro Ile Thr Gly Val Gly Thr Gln Gly His Phe Thr Leu Asp Trp Pro
225 230 235 240

Glu Leu Ser Glu Ile Glu Gln Thr Val Ile Asp Phe Ala Ser Leu Gly
245 250 255

Met Asp Val Met Ile Thr Glu Leu Asp Ile Asp Val Leu Pro Gln Pro
260 265 270

Asp Asp Tyr Thr Gly Ala Asp Val Asn Phe Ser Ala Glu Leu Tyr Asp
275 280 285

Glu Leu Asn Pro Trp Pro Asn Gly Leu Pro Pro Glu Ile Glu Gln Glu
290 295 300

Leu Ala Asn Arg Tyr Ala Asp Ile Phe Glu Ile Tyr Leu Arg His Arg
305 310 315 320

Asp Lys Val Thr Arg Val Ser Phe Trp Gly Val Thr Asp Gly Asp Ser
325 330 335

Trp Lys Asn Asn Trp Pro Val Pro Gly Arg Thr Asn Tyr Pro Leu Ile
340 345 350

Phe Asp Arg Asn Trp Lys Pro Lys Pro Ala Phe Phe Ser Ile Val Asp
355 360 365

Ala Ala Arg Glu Ala Leu Asp
370 375

<210> 279 
<211> 786 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 279 

atgctttctc cgacaaggaa actcccgccg gccattggac tcaccttcct cttcgccgct 60 

tcggcgacgc cggaaaccac gctcaaggac gccttcgcgg accattttct cgtcggggcg 120 

gcgctcaatg aatcgcactt tgcggagcac aatccggcgc acgccggtct cgtcgccgca 180 

aacttcaatg cgatcaccgc ggagaatgtg atgaaatggg aggccgttca tccccggccg 240 

ggagaatata cgttcggcgc cgcggaccgg ttcgttgagt tcggggaaaa gaacggcctg 300 

ttcatcgtgg ggcacacgct gatctggcat tctcaaacgc cggcctgggt tttcgaggat 360 

gagaatggcg cgccgctcgg ccgcgaggcg ctgctggagc ggatgcgcga tcacattcac 420 

accgttgccg gacgttacag gggccgtgtg aaggggtggg acgtggtcaa cgaagccctc 480 

gccgaggacg gttccctgcg ggattcgccg tggcgccgca tcataggcga cgactatttc 540 

gtgaaggcct ttgagtttgc gcgggaagct gatccggatg cggagttgta ttacaacgat 600 

tactcgattg aaaacgaacc gaagcgcaag ggggcggtgg cgttggtgag gacgctccag 660 

gcggcgggtg ttcccgttgc cggcgtgggg attcagggac acggcaatct ccattggcct 720 

tctccgcggc ttgtcgaaga ggcgatccgg gactttgcca gtttgggcgt caaggtgatg 780 

atctga 786

<210> 280 
<211> 261 
<212> PRT 
<213> unknown 

<220> 
<223> Obtained from an environmental sample.

<221> SIGNAL 
<222> (1)...(22) 



<400> 280

Met Leu Ser Pro Thr Arg Lys Leu Pro Pro Ala Ile Gly Leu Thr Phe
1 5 10 15

Leu Phe Ala Ala Ser Ala Thr Pro Glu Thr Thr Leu Lys Asp Ala Phe
20 25 30

Ala Asp His Phe Leu Val Gly Ala Ala Leu Asn Glu Ser His Phe Ala
35 40 45

Glu His Asn Pro Ala His Ala Gly Leu Val Ala Ala Asn Phe Asn Ala
50 55 60

Ile Thr Ala Glu Asn Val Met Lys Trp Glu Ala Val His Pro Arg Pro
65 70 75 80

Gly Glu Tyr Thr Phe Gly Ala Ala Asp Arg Phe Val Glu Phe Gly Glu
85 90 95

Lys Asn Gly Leu Phe Ile Val Gly His Thr Leu Ile Trp His Ser Gln
100 105 110

Thr Pro Ala Trp Val Phe Glu Asp Glu Asn Gly Ala Pro Leu Gly Arg
115 120 125

Glu Ala Leu Leu Glu Arg Met Arg Asp His Ile His Thr Val Ala Gly
130 135 140

Arg Tyr Arg Gly Arg Val Lys Gly Trp Asp Val Val Asn Glu Ala Leu
145 150 155 160

Ala Glu Asp Gly Ser Leu Arg Asp Ser Pro Trp Arg Arg Ile Ile Gly
165 170 175

Asp Asp Tyr Phe Val Lys Ala Phe Glu Phe Ala Arg Glu Ala Asp Pro
180 185 190

Asp Ala Glu Leu Tyr Tyr Asn Asp Tyr Ser Ile Glu Asn Glu Pro Lys
195 200 205

Arg Lys Gly Ala Val Ala Leu Val Arg Thr Leu Gln Ala Ala Gly Val
210 215 220

Pro Val Ala Gly Val Gly Ile Gln Gly His Gly Asn Leu His Trp Pro
225 230 235 240

Ser Pro Arg Leu Val Glu Glu Ala Ile Arg Asp Phe Ala Ser Leu Gly
245 250 255

Val Lys Val Met Ile
260

<210> 281 
<211> 963 
<212> DNA 
<213> Unknown 



<220> 

<223> Obtained from an environmental sample.
<400> 281 

gtgggcacct gcatgagcgg ggccgattcg cgcaaccctg cccgtctgga gctgatcaga 60 

acgcagtaca gcatcatcac cccggaaaac gagctcaagc ccgattccgt tctggatgtg 120 

gctgccagcc gtgctctggc caaggaggac gataccgccg tggccgtgca tttcagcgcc 180 

gccgctccca tcctgaactt tgcccgtgac aacggcatca aggtgcacgg tcatgtgctg 240 

gtctggcaca gccagactcc cgaggagttc ttccacgagg gctataacgc ctccgcgccc 300 

tatgtgagcc gcgaggtgat gctggcccgt ctggacaact acatccgtct catctttgaa 360 

tatatggatg aaaactatcc cggcctgatc gtgtcctggg atgtggccaa cgaatgcgtg 420 

gccgacggct ccaccgccct gcgcacctcc aactggaccc gcgtggtggg gcaggatttt 480 

gtggcccgcg ccttcgagat cgccgataaa tacgcgcccg aagatgtgat gctctgctac 540 

aacgattatt ccactcccta tgagcccaag ctcaccggca tcgtgaacct gctcaccgag 600 

ctgacacagg agggtcatat cgacggctac ggcttccaga gccactacag tgtcggcgat 660 

ccctccctgc aggcggtcga gaacgcgttc aaaaagatct ccgccctggg gctcaagctg 720 

cgcgtgagcg agctggacat caaggtagat gccgacagcg agcccaaccg cgcccttcag 780 

gccgaccggt atgaggccct gctgcgcatc tatatgaaat acggcgtcag cgccgtgcag 840 

gtgtggggcg tatgcgacgg caccagctgg atcggcgcga gctatcccct cccctttgac 900 

gccgggctgc gtcccaagcc ctccttcttc ggcatactcc gcgcccttga cgaacagaac 960 

tga 963

<210> 282 
<211> 320
<212> PRT 
<213> unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 282 

Met Gly Thr Cys Met Ser Gly Ala Asp Ser Arg Asn Pro Ala Arg Leu
1 5 10 15

Glu Leu Ile Arg Thr Gln Tyr Ser Ile Ile Thr Pro Glu Asn Glu Leu
20 25 30

Lys Pro Asp Ser Val Leu Asp Val Ala Ala Ser Arg Ala Leu Ala Lys
35 40 45

Glu Asp Asp Thr Ala Val Ala Val His Phe Ser Ala Ala Ala Pro Ile
50 55 60

Leu Asn Phe Ala Arg Asp Asn Gly Ile Lys Val His Gly His Val Leu
65 70 75 80

Val Trp His Ser Gln Thr Pro Glu Glu Phe Phe His Glu Gly Tyr Asn
85 90 95

Ala Ser Ala Pro Tyr Val Ser Arg Glu Val Met Leu Ala Arg Leu Asp
100 105 110

Asn Tyr Ile Arg Leu Ile Phe Glu Tyr Met Asp Glu Asn Tyr Pro Gly
115 120 125

Leu Ile Val Ser Trp Asp Val Ala Asn Glu Cys Val Ala Asp Gly Ser
130 135 140

Thr Ala Leu Arg Thr Ser Asn Trp Thr Arg Val Val Gly Gln Asp Phe
145 150 155 160

Val Ala Arg Ala Phe Glu Ile Ala Asp Lys Tyr Ala Pro Glu Asp Val
165 170 175

Met Leu Cys Tyr Asn Asp Tyr Ser Thr Pro Tyr Glu Pro Lys Leu Thr
180 185 190

Gly Ile Val Asn Leu Leu Thr Glu Leu Thr Gln Glu Gly His Ile Asp
195 200 205

Gly Tyr Gly Phe Gln Ser His Tyr Ser Val Gly Asp Pro Ser Leu Gln
210 215 220

Ala Val Glu Asn Ala Phe Lys Lys Ile Ser Ala Leu Gly Leu Lys Leu
225 230 235 240

Arg Val Ser Glu Leu Asp Ile Lys Val Asp Ala Asp Ser Glu Pro Asn
245 250 255

Arg Ala Leu Gln Ala Asp Arg Tyr Glu Ala Leu Leu Arg Ile Tyr Met
260 265 270

Lys Tyr Gly Val Ser Ala Val Gln Val Trp Gly Val Cys Asp Gly Thr
275 280 285

Ser Trp Ile Gly Ala Ser Tyr Pro Leu Pro Phe Asp Ala Gly Leu Arg
290 295 300

Pro Lys Pro Ser Phe Phe Gly Ile Leu Arg Ala Leu Asp Glu Gln Asn
305 310 315 320

<210> 283 
<211> 4161 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample.
<400> 283 

atgtataaaa gtttcgtcaa gaaagtctcc cttgtattat ctactctttt gctcttagtt 60 

tcggcgtttc ctgtctcata tgcacaaatg aattccatcc ccgtttatga agaaacgttt 120 

gaaaaccaag gaaactatgt ccaatctggt ggtgcgaccc tcactctagt aaaaaacaaa 180 

gtgtttgcag ggaatgaaga tggaactgca ctatatatta gtaatcgatc gaataactgg 240 

gacggggcag atttccgttt cacggatctt ggattacaag atggaaaaac atatacgatc 300 

aatattatag gatatgtcga tgaaaatgaa gttgttcctt caggagccca agtgtatttg 360 

caaactgtag ataaaacata tggatggtta gcaagcgcgg acttaaaaaa cggagagtcg 420 

ttcactataa atacaacgtt cacccttgac atgagtaaag gggacacccg tcttcgtata 480 

caatccaacg atagtggtaa aaaagtttca ttttacgtcg ggtatttttc aatttcaatt 540 

agtgatgtag aaggagaaga tggtgggagc tctatttcaa ggccaccggc tttacctttt 600 

gaaactattg actttgaaga tcaaagttta agtggatttg agggacgagc aggcacggaa 660 
acattgaccg ttacgaatga agcaaataga actcctggag gatcttatgc actaaaagtg 720

gaaaatagat ctcaaaattg gcatggacct tccttacgca tcgagaaata tattgattta 780 

ggttacgaat atacaatttc tctatgggtt aaacttattt cacccacaag tgcacaaatt 840 

cagctttcta cccaagtcgg aagtggaagt ggtgcgagtt ataacaatat tttaagtaaa 900 

gtaattagtg ttgatgatgg atgggtactg tatgaaggaa agtatcgcta caatagttcg 960 

ggaggggaat atttaacaat ttacgtagaa agcccaaaca atagtactgc atctttttac 1020 

atcgatgata ttcgtttaat aaagagtgga gacccaatct ctgtacaaaa agatcttctc 1080 

cctatcaaga gtgtttatga aggtgacttc ttagttggta gtgccgtatc agcgactgat 1140 

ttagagggag agagactcga gcttctcaag ttgcattaca atagcataac agcggaaaac 1200 

gccatgaaac ctagctattt acaacctact aaaggaaact ttaccttcga agcagcagat 1260 

agtattgtaa ataaagccct agaagaagga atgaaagtac atggacatgt tctcgtatgg 1320 

catcagcaga cacctgaatg gatgaccact agagaagatg gaagccctct cggcagggaa 1380 

gaagcgttag aaaatctaaa aaatcacatt gaaacagtta tgaaacattt tggtgataga 1440 

gtaatttcat gggatgttgt caatgaagct atcattgata atccacctaa tcctgataat 1500 

tgggaggaat cattaagaaa atcaccatgg tactattcaa tcggttctga ttatgttgag 1560 

caagcattcc gaattgcacg acaagttttg gacgaaaatg ggtgggatat taagctatat 1620 

tacaatgatt acaatgaaga taatcaaaga aaagcacaag ccatttacca tatggtaaaa 1680 

gagcttaatg aaaaatatgc acaagagcat cctggtaaaa gattaatcga tggaattgga 1740 

atgcaagggc attacagtat acgaacaaat ccagataatg tgaaaatgtc attagaaaga 1800 

tttatttccc ttggtgtgga agttagtatt actgaactcg atattcaagc tggaacggat 1860 

aatcatctta cggaagaaca gtcaaaagca caagcatatt tatacgctaa attattcaaa 1920 

atattcaaag aaaatgcatc gcatatctcc cgagttacgt tatggggatt aaatgatgcg 1980 

gcaagttggc gtgcgtcaac aagtccattg ttatttgatc gaaatttaca ggccaaacca 2040 

agttactatg cggtaattga tcctgataca tttatagaag aaaatcctac tgtgacagaa 2100 

gagtcgcgga aagcaattgc tttgtatggt atccctgtaa ttgatggaag catcgattcc 2160 

atttgggaaa gtgttcctta catccctatt gatcgttacc aaatggcgtg gcaaggagca 2220 

agcggaactg ctaaagttct ttgggacgaa gggaatctgt atgtattagt acaagttaac 2280 

gatgaccagt tagataagtc gagtacaaat ccttgggagc aggattcgat agaggtgttt 2340 

gtggacgaaa ataatgcaaa aacatcgttt taccaagagg atgatggaca atatagagta 2400 

aactttgaca acgaaacatc gtttaatcca ccaagcattg aaaatggatt tatgtccgaa 2460 

actaatgtat ctgggactaa ttatgtggtg gagatgaaaa tccctttaag aagtatacaa 2520 

ctaaaaaatg ggtctgaaat agggtttgat gttcaaatta atgatgggaa aaatggtgct 2580 

cgtcagagtg tggctgcttg gaatgatacg actggaactg catatatgga tacatctgta 2640 

ttcgggacac ttactctttt aaccacttta gataatgaaa atacaccagg cagcggcaca 2700 

acaccaggta gtggcacaac accaggcagt ggcacaacac caggcagcag cacaacacca 2760 

ggcagcggta caacaccagg tagcggcaca acaccaggca gtggcacaac accaggcagc 2820 

ggcacaacac caggcagtgg cacaacacca ggcagtggca caacaccggg cagcggtaca 2880 

acaccaggca gtggcacaac accaggcagt ggtacaacac cgggcagtgg cacaacacca 2940 

gtgaagggtg aaaatggtac ggttgtttta cagccgaaag tagagacgaa agaaaaagac 3000 

ggcaaagtag tagaaaaagt ggcaactatt tcaacaaatg aagttgaagc gattgtcaag 3060 

gagctgtcga atgaaaataa acaagtcgtc gtctccctcg gctcgcttcc aaaaggtgta 3120 

gccacaaaag tagatgtgcc agctacatta tttacacaag cggcaaataa acaagcagaa 3180 

gcaacgattg tgagcgccag tgaacaagcg acgtacaaat tgccagtcaa agaagtgcag 3240 

gcgtctcttg cgacgattgc ccggtcactc ggtgcaacga tagaacaagt tagcatctcg 3300 

attgaaatga aagtgaacga tgcgccgtca ctacgtgtga aaccgttgtc tgatgcggta 3360 

gagtttcatg tcgtggcgaa agcaaatgga aaggaacgcg tcatcgatcg gtttactcaa 3420 

tatgtcgaac gcgaaatcgc gttaaagcaa tcggtcaacg ctagtcgtgc cattgcagtg 3480 

cgcgtgaacg atgacggttc acttacccca gtaccgacaa cgtttgttgg caacaaagca 3540 

gtcattaaat cgttgacgaa ctcgacgtat gttgttgtgg aaggaacaca tacatttagt 3600 

gacatccaac cacattgggc gaaaggttat attgaaacac tcgcggcaaa acagcttgtc 3660 

aaagggatga cggacacaac atatcgacca aatgatcgga tgacgcgcgc gcaatttgcg 3720 

gtgttgctcg tacgggcgct aggattgccg agcgaaacgt atgacggtcg ctttgctgat 3780 

gtgaagggaa cggagtggtt taacaagaac ggtgaattag cagcggcagt caagttcgga 3840 

atcattcaag gaaaaacagc ttatatgttt gcgccgaatg agccaatcac tcgcgcacaa 3900 

gcagctgtca tgatcgaacg ggcattgaaa ctttcgatcg ttggctatga tgaggcaaca 3960 

agcgacaaaa cgaaaaaagt gacagatttc cgcgatgcaa aacaattgcc aacatgggca 4020 

aaacaggcga ttgaagcagt ataccaagca ggcatcatgc aaggacgaga tagcggaaac 4080 

tttgatccga caagccatgt gacgcgtgcc gaaatggcga aggtgttaat ggatatttta 4140 
gagttgacaa aacttattta a 4161

<210> 284 
<211> 1386 
<212> PRT 
<213> unknown 

<220> 

<223> Obtained from an environmental sample. 
<221> SIGNAL 
<222> (1)...(28)
<400> 284 

Met Tyr Lys Ser Phe Val Lys Lys Val Ser Leu Val Leu Ser Thr Leu
1 5 10 15

Leu Leu Leu Val Ser Ala Phe Pro Val Ser Tyr Ala Gln Met Asn Ser
20 25 30

Ile Pro Val Tyr Glu Glu Thr Phe Glu Asn Gln Gly Asn Tyr Val Gln
35 40 45

Ser Gly Gly Ala Thr Leu Thr Leu Val Lys Asn Lys Val Phe Ala Gly
50 55 60

Asn Glu Asp Gly Thr Ala Leu Tyr Ile Ser Asn Arg Ser Asn Asn Trp
65 70 75 80

Asp Gly Ala Asp Phe Arg Phe Thr Asp Leu Gly Leu Gln Asp Gly Lys
85 90 95

Thr Tyr Thr Ile Asn Ile Ile Gly Tyr Val Asp Glu Asn Glu Val Val
100 105 110

Pro Ser Gly Ala Gln Val Tyr Leu Gln Thr Val Asp Lys Thr Tyr Gly
115 120 125

Trp Leu Ala Ser Ala Asp Leu Lys Asn Gly Glu Ser Phe Thr Ile Asn
130 135 140

Thr Thr Phe Thr Leu Asp Met Ser Lys Gly Asp Thr Arg Leu Arg Ile
145 150 155 160

Gln Ser Asn Asp Ser Gly Lys Lys Val Ser Phe Tyr Val Gly Tyr Phe
165 170 175

Ser Ile Ser Ile Ser Asp Val Glu Gly Glu Asp Gly Gly Ser Ser Ile
180 185 190

Ser Arg Pro Pro Ala Leu Pro Phe Glu Thr Ile Asp Phe Glu Asp Gln
195 200 205

Ser Leu Ser Gly Phe Glu Gly Arg Ala Gly Thr Glu Thr Leu Thr Val
210 215 220

Thr Asn Glu Ala Asn Arg Thr Pro Gly Gly Ser Tyr Ala Leu Lys Val
225 230 235 240

Glu Asn Arg Ser Gln Asn Trp His Gly Pro Ser Leu Arg Ile Glu Lys
245 250 255

Tyr Ile Asp Leu Gly Tyr Glu Tyr Thr Ile Ser Leu Trp Val Lys Leu
260 265 270

Ile Ser Pro Thr Ser Ala Gln Ile Gln Leu Ser Thr Gln Val Gly Ser
275 280 285

Gly Ser Gly Ala Ser Tyr Asn Asn Ile Leu Ser Lys Val Ile Ser Val
290 295 300

Asp Asp Gly Trp Val Leu Tyr Glu Gly Lys Tyr Arg Tyr Asn Ser Ser
305 310 315 320

Gly Gly Glu Tyr Leu Thr Ile Tyr Val Glu Ser Pro Asn Asn Ser Thr
325 330 335

Ala Ser Phe Tyr Ile Asp Asp Ile Arg Leu Ile Lys Ser Gly Asp Pro
340 345 350

Ile Ser Val Gln Lys Asp Leu Leu Pro Ile Lys Ser Val Tyr Glu Gly
355 360 365

Asp Phe Leu Val Gly Ser Ala Val Ser Ala Thr Asp Leu Glu Gly Glu
370 375 380

Arg Leu Glu Leu Leu Lys Leu His Tyr Asn Ser Ile Thr Ala Glu Asn
385 390 395 400

Ala Met Lys Pro Ser Tyr Leu Gln Pro Thr Lys Gly Asn Phe Thr Phe
405 410 415

Glu Ala Ala Asp Ser Ile Val Asn Lys Ala Leu Glu Glu Gly Met Lys
420 425 430

Val His Gly His Val Leu Val Trp His Gln Gln Thr Pro Glu Trp Met
435 440 445

Thr Thr Arg Glu Asp Gly Ser Pro Leu Gly Arg Glu Glu Ala Leu Glu
450 455 460

Asn Leu Lys Asn His Ile Glu Thr Val Met Lys His Phe Gly Asp Arg
465 470 475 480

Val Ile Ser Trp Asp Val Val Asn Glu Ala Ile Ile Asp Asn Pro Pro
485 490 495

Asn Pro Asp Asn Trp Glu Glu Ser Leu Arg Lys Ser Pro Trp Tyr Tyr
500 505 510

Ser Ile Gly Ser Asp Tyr Val Glu Gln Ala Phe Arg Ile Ala Arg Gln
515 520 525

Val Leu Asp Glu Asn Gly Trp Asp Ile Lys Leu Tyr Tyr Asn Asp Tyr
530 535 540

Asn Glu Asp Asn Gln Arg Lys Ala Gln Ala Ile Tyr His Met Val Lys
545 550 555 560

Glu Leu Asn Glu Lys Tyr Ala Gln Glu His Pro Gly Lys Arg Leu Ile
565 570 575

Asp Gly Ile Gly Met Gln Gly His Tyr Ser Ile Arg Thr Asn Pro Asp
580 585 590

Asn Val Lys Met Ser Leu Glu Arg Phe Ile Ser Leu Gly Val Glu Val
595 600 605

Ser Ile Thr Glu Leu Asp Ile Gln Ala Gly Thr Asp Asn His Leu Thr
610 615 620

Glu Glu Gln Ser Lys Ala Gln Ala Tyr Leu Tyr Ala Lys Leu Phe Lys
625 630 635 640

Ile Phe Lys Glu Asn Ala Ser His Ile Ser Arg Val Thr Leu Trp Gly
645 650 655

Leu Asn Asp Ala Ala Ser Trp Arg Ala Ser Thr Ser Pro Leu Leu Phe
660 665 670

Asp Arg Asn Leu Gln Ala Lys Pro Ser Tyr Tyr Ala Val Ile Asp Pro
675 680 685

Asp Thr Phe Ile Glu Glu Asn Pro Thr Val Thr Glu Glu Ser Arg Lys
690 695 700

Ala Ile Ala Leu Tyr Gly Ile Pro Val Ile Asp Gly Ser Ile Asp Ser
705 710 715 720

Ile Trp Glu Ser Val Pro Tyr Ile Pro Ile Asp Arg Tyr Gln Met Ala
725 730 735

Trp Gln Gly Ala Ser Gly Thr Ala Lys Val Leu Trp Asp Glu Gly Asn
740 745 750

Leu Tyr Val Leu Val Gln Val Asn Asp Asp Gln Leu Asp Lys Ser Ser
755 760 765

Thr Asn Pro Trp Glu Gln Asp Ser Ile Glu Val Phe Val Asp Glu Asn
770 775 780

Asn Ala Lys Thr Ser Phe Tyr Gln Glu Asp Asp Gly Gln Tyr Arg Val
785 790 795 800

Asn Phe Asp Asn Glu Thr Ser Phe Asn Pro Pro Ser Ile Glu Asn Gly
805 810 815

Phe Met Ser Glu Thr Asn Val Ser Gly Thr Asn Tyr Val Val Glu Met
820 825 830

Lys Ile Pro Leu Arg Ser Ile Gln Leu Lys Asn Gly Ser Glu Ile Gly
835 840 845

Phe Asp Val Gln Ile Asn Asp Gly Lys Asn Gly Ala Arg Gln Ser Val
850 855 860

Ala Ala Trp Asn Asp Thr Thr Gly Thr Ala Tyr Met Asp Thr Ser Val
865 870 875 880

Phe Gly Thr Leu Thr Leu Leu Thr Thr Leu Asp Asn Glu Asn Thr Pro
885 890 895

Gly Ser Gly Thr Thr Pro Gly Ser Gly Thr Thr Pro Gly Ser Gly Thr
900 905 910

Thr Pro Gly Ser Ser Thr Thr Pro Gly Ser Gly Thr Thr Pro Gly Ser
915 920 925

Gly Thr Thr Pro Gly Ser Gly Thr Thr Pro Gly Ser Gly Thr Thr Pro
930 935 940

Gly Ser Gly Thr Thr Pro Gly Ser Gly Thr Thr Pro Gly Ser Gly Thr
945 950 955 960

Thr Pro Gly Ser Gly Thr Thr Pro Gly Ser Gly Thr Thr Pro Gly Ser
965 970 975

Gly Thr Thr Pro Val Lys Gly Glu Asn Gly Thr Val Val Leu Gln Pro
980 985 990

Lys Val Glu Thr Lys Glu Lys Asp Gly Lys Val Val Glu Lys Val Ala
995 1000 1005

Thr Ile Ser Thr Asn Glu Val Glu Ala Ile Val Lys Glu Leu Ser Asn
1010 1015 1020

Glu Asn Lys Gln Val Val Val Ser Leu Gly Ser Leu Pro Lys Gly Val
1025 1030 1035 1040

Ala Thr Lys Val Asp Val Pro Ala Thr Leu Phe Thr Gln Ala Ala Asn
1045 1050 1055

Lys Gln Ala Glu Ala Thr Ile Val Ser Ala Ser Glu Gln Ala Thr Tyr
1060 1065 1070

Lys Leu Pro Val Lys Glu Val Gln Ala Ser Leu Ala Thr Ile Ala Arg
1075 1080 1085

Ser Leu Gly Ala Thr Ile Glu Gln Val Ser Ile Ser Ile Glu Met Lys
1090 1095 1100

Val Asn Asp Ala Pro Ser Leu Arg Val Lys Pro Leu Ser Asp Ala Val
1105 1110 1115 1120

Glu Phe His Val Val Ala Lys Ala Asn Gly Lys Glu Arg Val Ile Asp
1125 1130 1135

Arg Phe Thr Gln Tyr Val Glu Arg Glu Ile Ala Leu Lys Gln Ser Val
1140 1145 1150

Asn Ala Ser Arg Ala Ile Ala Val Arg Val Asn Asp Asp Gly Ser Leu
1155 1160 1165

Thr Pro Val Pro Thr Thr Phe Val Gly Asn Lys Ala Val Ile Lys Ser
1170 1175 1180

Leu Thr Asn Ser Thr Tyr Val Val Val Glu Gly Thr His Thr Phe Ser
1185 1190 1195 1200

Asp Ile Gln Pro His Trp Ala Lys Gly Tyr Ile Glu Thr Leu Ala Ala
1205 1210 1215

Lys Gln Leu Val Lys Gly Met Thr Asp Thr Thr Tyr Arg Pro Asn Asp
1220 1225 1230

Arg Met Thr Arg Ala Gln Phe Ala Val Leu Leu Val Arg Ala Leu Gly
1235 1240 1245

Leu Pro Ser Glu Thr Tyr Asp Gly Arg Phe Ala Asp Val Lys Gly Thr
1250 1255 1260

Glu Trp Phe Asn Lys Asn Gly Glu Leu Ala Ala Ala Val Lys Phe Gly
1265 1270 1275 1280

Ile Ile Gln Gly Lys Thr Ala Tyr Met Phe Ala Pro Asn Glu Pro Ile
1285 1290 1295

Thr Arg Ala Gln Ala Ala Val Met Ile Glu Arg Ala Leu Lys Leu Ser
1300 1305 1310

Ile Val Gly Tyr Asp Glu Ala Thr Ser Asp Lys Thr Lys Lys Val Thr
1315 1320 1325

Asp Phe Arg Asp Ala Lys Gln Leu Pro Thr Trp Ala Lys Gln Ala Ile
1330 1335 1340

Glu Ala Val Tyr Gln Ala Gly Ile Met Gln Gly Arg Asp Ser Gly Asn
1345 1350 1355 1360

Phe Asp Pro Thr Ser His Val Thr Arg Ala Glu Met Ala Lys Val Leu
1365 1370 1375

Met Asp Ile Leu Glu Leu Thr Lys Leu Ile
1380 1385

<210> 285 
<211> 1569 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 285 

gtgaaccgaa acacatctcc ggacttcaag cttggagcag gcgttacggt aagtgatttc 60 

ctgaacaagg gcaggcaata tgaggtcgct gtggcgaatc ttgatgaaat gacggccgga 120 

aatgccatga agcagtcttc tgtaatgagg cctgacggat ccatggattt cactcaggtc 180 

agaagattca tcgaggaggc cgaacgtgtc ggaatgacag tgtacggcca tacattggca 240 

tggcattcac agcagcagaa cgcctatctt aacggtctga tcaagggcaa gaagaccgag 300 

gtcgagccag gccaggagtc agaggtcgtt cttctccaga cagatttcaa tgacggaaat 360 

gtcacattca acggatgggg aaacaattct tcaaggactg tcgagaatgg tgcattaaag 420 

cttacaaacc cttctgtagt aaacagttgg gaggcccagt tcgcatatga tttttcagag 480 

gccttcgaga tggacaagac atataagctc aagttcagga tcaagggctc ggctgcagga 540 

aagatcgcgg caggcttcca gatcactgac ggctaccttt cggcaggtga gttcggaacc 600 

gtagagttca atacccagtg gaaggatgtc gagctctcat gcgtatgttc cgctgaaggc 660 

ggtacacgct tgatcttcag tttcggagag tttgccggag atatctatat cgacgatttc 720 

tgcttcagtg tggaaggagc tggatatatc tacgaagatc ttaccccggc agagaagaag 780 

gagcgcctta ccgaggcaat ggaccgttgg atcaagggaa tgatggaggt taccgctacc 840 

agggtttctg cgtgggatgc tgtcaatgag gcgatttccg gccgtgatac aaatggcgac 900 

ggcttctatg aacttgagtc ggcacaatgg ggaagctcga acaacttcta ttggcaggat 960 

tatctcggct caggagatta tgtgcgtatc gtgatcgcaa aggcccgcaa gtattatgag 1020 

gaattcggcg gtacggctcc tttgagactc ttcatcaatg actacaacct cgaatctgac 1080 

tgggatgaca acaagaagct caagagcctt atccattgga tcggtgtctg ggagtctgac 1140 

ggagtgacaa agatcgacgg aatcggtacc cagatgcacg tttcgtatta cgagaatcct 1200 
gatattcagg caagcaagga gaaacattat gtgcagatgc ttcagcttat ggcaaataca 1260

ggaaagctcg tgaagatctc cgagcttgat atgggctatg tagaccgcaa cggaaatact 1320 

gtgggaacag cggacatgac cgaccagcag catagggcca tggcggatta ttatgacttc 1380 

atcgtgcgca agtactttga gatcgtgcct cctgcacagc agtatggcat cacgcagtgg 1440 

tgcatgacgg atgctcccgg agctatcggc acaggctgga gaggcggtga gcctgtgggc 1500 

ctgtgggacc agaattacaa ccgcaagtat gcgtacgcag gatttgcaaa cggacttaga 1560 

gcgaaataa 1569

<210> 286 
<211> 522 
<212> PRT 
<213> unknown 

<220> 

<223> Obtained from an environmental sample.
<400> 286

Met Asn Arg Asn Thr Ser Pro Asp Phe Lys Leu Gly Ala Gly Val Thr
1 5 10 15

Val Ser Asp Phe Leu Asn Lys Gly Arg Gln Tyr Glu Val Ala Val Ala
20 25 30

Asn Leu Asp Glu Met Thr Ala Gly Asn Ala Met Lys Gln Ser Ser Val
35 40 45

Met Arg Pro Asp Gly Ser Met Asp Phe Thr Gln Val Arg Arg Phe Ile
50 55 60

Glu Glu Ala Glu Arg Val Gly Met Thr Val Tyr Gly His Thr Leu Ala
65 70 75 80

Trp His Ser Gln Gln Gln Asn Ala Tyr Leu Asn Gly Leu Ile Lys Gly
85 90 95

Lys Lys Thr Glu Val Glu Pro Gly Gln Glu Ser Glu Val Val Leu Leu
100 105 110

Gln Thr Asp Phe Asn Asp Gly Asn Val Thr Phe Asn Gly Trp Gly Asn
115 120 125

Asn Ser Ser Arg Thr Val Glu Asn Gly Ala Leu Lys Leu Thr Asn Pro
130 135 140

Ser Val Val Asn Ser Trp Glu Ala Gln Phe Ala Tyr Asp Phe Ser Glu
145 150 155 160

Ala Phe Glu Met Asp Lys Thr Tyr Lys Leu Lys Phe Arg Ile Lys Gly
165 170 175

Ser Ala Ala Gly Lys Ile Ala Ala Gly Phe Gln Ile Thr Asp Gly Tyr
180 185 190

Leu Ser Ala Gly Glu Phe Gly Thr Val Glu Phe Asn Thr Gln Trp Lys
195 200 205

Asp Val Glu Leu Ser Cys Val Cys Ser Ala Glu Gly Gly Thr Arg Leu
210 215 220

Ile Phe Ser Phe Gly Glu Phe Ala Gly Asp Ile Tyr Ile Asp Asp Phe
225 230 235 240

Cys Phe Ser Val Glu Gly Ala Gly Tyr Ile Tyr Glu Asp Leu Thr Pro
245 250 255

Ala Glu Lys Lys Glu Arg Leu Thr Glu Ala Met Asp Arg Trp Ile Lys
260 265 270

Gly Met Met Glu Val Thr Ala Thr Arg Val Ser Ala Trp Asp Ala Val
275 280 285

Asn Glu Ala Ile Ser Gly Arg Asp Thr Asn Gly Asp Gly Phe Tyr Glu
290 295 300

Leu Glu Ser Ala Gln Trp Gly Ser Ser Asn Asn Phe Tyr Trp Gln Asp
305 310 315 320

Tyr Leu Gly Ser Gly Asp Tyr Val Arg Ile Val Ile Ala Lys Ala Arg
325 330 335

Lys Tyr Tyr Glu Glu Phe Gly Gly Thr Ala Pro Leu Arg Leu Phe Ile
340 345 350

Asn Asp Tyr Asn Leu Glu Ser Asp Trp Asp Asp Asn Lys Lys Leu Lys
355 360 365

Ser Leu Ile His Trp Ile Gly Val Trp Glu Ser Asp Gly Val Thr Lys
370 375 380

Ile Asp Gly Ile Gly Thr Gln Met His Val Ser Tyr Tyr Glu Asn Pro
385 390 395 400

Asp Ile Gln Ala Ser Lys Glu Lys His Tyr Val Gln Met Leu Gln Leu
405 410 415

Met Ala Asn Thr Gly Lys Leu Val Lys Ile Ser Glu Leu Asp Met Gly
420 425 430

Tyr Val Asp Arg Asn Gly Asn Thr Val Gly Thr Ala Asp Met Thr Asp
435 440 445

Gln Gln His Arg Ala Met Ala Asp Tyr Tyr Asp Phe Ile Val Arg Lys
450 455 460

Tyr Phe Glu Ile Val Pro Pro Ala Gln Gln Tyr Gly Ile Thr Gln Trp
465 470 475 480

Cys Met Thr Asp Ala Pro Gly Ala Ile Gly Thr Gly Trp Arg Gly Gly
485 490 495

Glu Pro Val Gly Leu Trp Asp Gln Asn Tyr Asn Arg Lys Tyr Ala Tyr
500 505 510

Ala Gly Phe Ala Asn Gly Leu Arg Ala Lys
515 520



<210> 287 
<211> 1695 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 



<400> 287 

atgactattc atctacaaaa aaatctgctc tttgcagcag tcttactgac agggcaagca 60

gcgcatgcgc tcacctccgg atctggcgag gccacgctgg atgttaataa cagctggggt 120

tcaggttatt gcgctaacgt caccattgcc aacaacggca gccaagccat tacttcctgg 180

acggttggcc tgaaccttaa cggcacgacg atcaataatc tttggaacgg taatttgagt 240

ggggttacgg ttacccccgt ggcctacaat gcgaatgtgg cgccgggtgc taataccagc 300

tttggtttct gtgctaatgg cagcgcaacc cccacgttgg cgacgtttga agttcaaggc 360

ggcggtggcg cagccagtag ttccagcagt tctgctattt caagcagctc cagcagttcg 420

gttgccggcg gcagcaacag cgtcaccgta cgcatgagcg gcgtaaccgg agacgaaagc 480

gtgagcctgg aaatcggtgg tcagaccatc gagacctgga ccctgagcgt cggtatgctc 540

gactacaccg tacagaccaa cgctaccggt gagttgcgtg ttgcgtttac caatgatgaa 600

ggtgaccgcg atgtcgaagt ggattacatc atggtcaacg gcgtgaccta tcaggccgaa 660

gatcaggaag ataacaccgg cgcgtgggac ggagagtgtg gtgcgggttc cttctcgcag 720

atgctgcact gcaacggttc tatcggcttt ggcaatccat tcgatggcag caactcgtcg 780

tccagctcca gcaacacttc gagctcgagc agcagctcta accctaaccc tggcaatccg 840

aatttccccg atttcttcgt aggtaacatc accaccagcg gctcagttcg ctctgacttc 900

atgcagtatt gggatcaaat tacgccggaa aacgaaggta aatggggctc cgtagaggga 960

actcgcgacc agtacaactg gggtccgctg gatgcaattt acaatttcgc ccgtgcgaac 1020

ggtattcctg taaaagcgca cactctggtg tggggcagtc agcaaccggg ttggattggc 1080

ggcttgagtg ccgctgagca gcgcgcggaa attgaagagt ggattcgcga ttactgcgca 1140

cgctacccgg atacggcgat gatcgatgtg gtgaatgaag ctctgccttc gcacgctccg 1200

gcaaactatg cggccaatgc atttggaagc gattggatca ctgaatcctt ccgtttggca 1260

cgtcagtact gtccggatgc cgtgctgatc tacaacgact ataacttcat gacctgggac 1320

accgatgcca tcattcagat gattcgccca gccgtgaact ccggttatgt cgatgcactg 1380

ggtcttcaag cgcacagcct gtattcccca caagtctgga ccgctcagca aatccagagc 1440

aaattggatc agatttccga gttgggcttg ccgctgtaca tctccgagta cgacatcgaa 1500

gcaacgaatg atcagactca gttacagtac atgcagatgc acttcccgat cttctacaac 1560

catcccaacg tggccggtat taccctttgg ggttatgttg tgggtgctac ctggcgtgat 1620

ggcactggcc tgatccaaag caatggtcag cagcgcccgg ccatgcagtg gttgatggag 1680

tatctaaacc gctaa 1695



<210> 288 
<211> 564 
<212> PRT 
<213> unknown 

<220> 

<223> Obtained from an environmental sample.

<221> SIGNAL
<222> (1)...(23)

<400> 288

Met Thr Ile His Leu Gln Lys Asn Leu Leu Phe Ala Ala Val Leu Leu
1 5 10 15

Thr Gly Gln Ala Ala His Ala Leu Thr Ser Gly Ser Gly Glu Ala Thr
20 25 30

Leu Asp Val Asn Asn Ser Trp Gly Ser Gly Tyr Cys Ala Asn Val Thr
35 40 45

Ile Ala Asn Asn Gly Ser Gln Ala Ile Thr Ser Trp Thr Val Gly Leu
50 55 60

Asn Leu Asn Gly Thr Thr Ile Asn Asn Leu Trp Asn Gly Asn Leu Ser
65 70 75 80

Gly Val Thr Val Thr Pro Val Ala Tyr Asn Ala Asn Val Ala Pro Gly
85 90 95

Ala Asn Thr Ser Phe Gly Phe Cys Ala Asn Gly Ser Ala Thr Pro Thr
100 105 110

Leu Ala Thr Phe Glu Val Gln Gly Gly Gly Gly Ala Ala Ser Ser Ser
115 120 125

Ser Ser Ser Ala Ile Ser Ser Ser Ser Ser Ser Ser Val Ala Gly Gly
130 135 140

Ser Asn Ser Val Thr Val Arg Met Ser Gly Val Thr Gly Asp Glu Ser
145 150 155 160

Val Ser Leu Glu Ile Gly Gly Gln Thr Ile Glu Thr Trp Thr Leu Ser
165 170 175

Val Gly Met Leu Asp Tyr Thr Val Gln Thr Asn Ala Thr Gly Glu Leu
180 185 190

Arg Val Ala Phe Thr Asn Asp Glu Gly Asp Arg Asp Val Glu Val Asp
195 200 205

Tyr Ile Met Val Asn Gly Val Thr Tyr Gln Ala Glu Asp Gln Glu Asp
210 215 220

Asn Thr Gly Ala Trp Asp Gly Glu Cys Gly Ala Gly Ser Phe Ser Gln
225 230 235 240

Met Leu His Cys Asn Gly Ser Ile Gly Phe Gly Asn Pro Phe Asp Gly
245 250 255

Ser Asn Ser Ser Ser Ser Ser Ser Asn Thr Ser Ser Ser Ser Ser Ser
260 265 270

Ser Asn Pro Asn Pro Gly Asn Pro Asn Phe Pro Asp Phe Phe Val Gly
275 280 285

Asn Ile Thr Thr Ser Gly Ser Val Arg Ser Asp Phe Met Gln Tyr Trp
290 295 300

Asp Gln Ile Thr Pro Glu Asn Glu Gly Lys Trp Gly Ser Val Glu Gly
305 310 315 320

Thr Arg Asp Gln Tyr Asn Trp Gly Pro Leu Asp Ala Ile Tyr Asn Phe
325 330 335

Ala Arg Ala Asn Gly Ile Pro Val Lys Ala His Thr Leu Val Trp Gly
340 345 350

Ser Gln Gln Pro Gly Trp Ile Gly Gly Leu Ser Ala Ala Glu Gln Arg
355 360 365

Ala Glu Ile Glu Glu Trp Ile Arg Asp Tyr Cys Ala Arg Tyr Pro Asp
370 375 380

Thr Ala Met Ile Asp Val Val Asn Glu Ala Leu Pro Ser His Ala Pro
385 390 395 400

Ala Asn Tyr Ala Ala Asn Ala Phe Gly Ser Asp Trp Ile Thr Glu Ser
405 410 415

Phe Arg Leu Ala Arg Gln Tyr Cys Pro Asp Ala Val Leu Ile Tyr Asn
420 425 430

Asp Tyr Asn Phe Met Thr Trp Asp Thr Asp Ala Ile Ile Gln Met Ile
435 440 445

Arg Pro Ala Val Asn Ser Gly Tyr Val Asp Ala Leu Gly Leu Gln Ala
450 455 460

His Ser Leu Tyr Ser Pro Gln Val Trp Thr Ala Gln Gln Ile Gln Ser
465 470 475 480

Lys Leu Asp Gln Ile Ser Glu Leu Gly Leu Pro Leu Tyr Ile Ser Glu
485 490 495

Tyr Asp Ile Glu Ala Thr Asn Asp Gln Thr Gln Leu Gln Tyr Met Gln
500 505 510

Met His Phe Pro Ile Phe Tyr Asn His Pro Asn Val Ala Gly Ile Thr
515 520 525

Leu Trp Gly Tyr Val Val Gly Ala Thr Trp Arg Asp Gly Thr Gly Leu
530 535 540

Ile Gln Ser Asn Gly Gln Gln Arg Pro Ala Met Gln Trp Leu Met Glu
545 550 555 560

Tyr Leu Asn Arg

<210> 289
<211> 2796 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 289 

atgaagttca ctttgacacc gctgctgtgc gggttcgcct tattgttggg ttgcgcggtg 60 

caggcaaccc cagccgcttc gttaaagcag gcctatcagc cgtttttcca tatcggcacc 120 

gcagtcagtc tggcgcaatt acaaccatcc aaagaacatg aacgcgcttt aattgcgcag 180 

cactttaaca gtctgaccgc cgaaaacctg atgaaatggg aggaaattca acccacggaa 240 

ggcaactttg attttaaagc ggccgatcag ttggttgcgt ttgccgaaca acatcaaatg 300 

tggatgatcg gccataccat tctgtggcat gaacaaaccc cagactgggt gtttcagggg 360 

ctggatggca aacccgccag caagcagctg ctactggccc gcttgaccaa acatatccaa 420 

acggtcgttg gccgttacca gggccgggtc aatggctggg atgtggtgaa tgaagcgctc 480 

aatgaagatg gcagcctgcg cgataccccg tggcggcgca ttttgggtga tgattacatt 540 

gccaccactt ttgcgttggt gcatcaggtc gaccctaaag ccaaactcta ttacaacgat 600 

tacaacctgt ttaaaccgga aaaacgcgcc ggggtgctgc ggattatcca acaactgcag 660 

caaaaaaatg tgcctattca tgccattggt gaacaagcgc attacggcct tgattcgcct 720 

gcattcaaag acgttgaaga ttcgatcaac gcttttgctg ccaccggcct ggacgtgatg 780 

ctaaccgaac tggagatttc agtattgccg tatccatccg gcatgacgca gggtgccgat 840 

atcagtcagc atcaggaatt gcaggaacaa ctaaacccct atcgcgatgg tttgcccaaa 900 

gccgtcgaac aagcctggca acaacggtat ctcgatttgt tttcgctgtt attacgccag 960 

cagcaaaaac tgcatcgcgt gaccttctgg ggcttagatg atggccaaag ctggcgtaat 1020 

aatttcccga tgcgcggtcg caccgattac ccactgctgt ttgatcgcaa gctgcaagcc 1080 

aaaccgctgt taagcgcact gacggcatta gccgcagacc agactaaagc caagcccaaa 1140 

atgaatcagc tgggctttgc gccgacttcg accaaactgt tgattgtgcc gggtcggcaa 1200 

tcagtgcctt ttcatgtttt ggataccgag accggccaaa cggtgctgca aggccaaagt 1260 

tcggcggcca ggttttggcc tgaatcgggg gaatgggtca gtgctgccga tttttctgcg 1320 

gtgataactc ccggcaccta tcagatcaac atctcaggaa cgccgccaca aactgtcaag 1380 

atccaggccg aaccctatgc cgcgctgcat gatgcggcaa tcaaagccta ttattttaac 1440 

cgcgcctcgc tcacactgga gccaaagttt gccggacctt gggcacgcgc agcggggcat 1500 

ccggatacca aagtacgggt gcatgcttct gctgcatcgg ccagcaggcc agaaggttat 1560 

gagctcagcg ctgccaaagg ctggtatgac gccggtgact acaacaaata cgtggtgaat 1620 

tccggcatta ccagttacac cctgttgcag gcctggcagg attttcctga gttttatcaa 1680 

agccggacct ggaatattcc ggagtccggc aacgcggtac cggacattct cgacgaaacc 1740 

ttatggaatc tgcagtggtt cagcgccatg caagacccaa acgacggggg cgtctatcac 1800 

aagctgactg aactgaattt ttcggcaacc caaatgccgg accaagtgac agcagagcgt 1860 

tatgtggtgc aaaaaaccac cgccgcggca ctgaatttcg ctgcggtgtt ggccaaagcc 1920 

agtacggttt ttgccaaatt tgacgcccag ttgcccggcc tgtcgcaaca ataccgtcag 1980 

caagcactgc tcgcctggca atgggcgcaa aaaaatccgc agcaaatcta tcaacaaccc 2040 

aaagatgtcc acactggcgc ttatggtgac aaacaactgg ctgatgaatg ggcctgggct 2100 

ggcgccgagc tgtatttatt gaccggcgag caaagttatc tgcagccact gttggcgctg 2160 

gacacgccaa tcagtgcagc atcctgggcc agtgtcagcg ctttggggta tttttctttg 2220 

gcttcggcga aacagcttga gcccgcacta cggcaacagg tacaacagaa aatccaacaa 2280 

gccgccgcgc aaatcctgca ggaacatcaa acatccgcct atcaggtggc gatgaccaaa 2340 

aacgattttg tctggggcag taatgcggtg gcaatgaata aagcgatgtt gttataccag 2400 

gcgtggaaaa tagcgccaaa accggagctg ctacaggcga tgcaaggtct ggttgattac 2460 

gttttggggc gcaacccgtt gcagcagtct tatgtcacag ggtttggcga gcaaagcccg 2520 

cagcagatcc accaccgacc ttcggccgcc gatgccatca aagcgccggt accaggttgg 2580 

ttagtcggtg gtgcacagcc gggtaagcag gataaatgca cttatgccgg cgctttaccc 2640 

gctgtcggcg ctttacccgc tgccagcacc ttaccagcca ccacttatct tgatgactgg 2700 

tgcagttacg ccaccaacga agtggcgatt aactggaatg cacctttggt gtatgtgctg 2760 

gcatggcacc tttcgcaaaa caccaagaca ccataa 2796 

<210> 290 
<211> 931 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(22) 

<400> 290 

Met Lys Phe Thr Leu Thr Pro Leu Leu Cys Gly Phe Ala Leu Leu Leu 
1 5 10 15 

Gly Cys Ala Val Gln Ala Thr Pro Ala Ala Ser Leu Lys Gln Ala Tyr 
20 25 30 

Gln Pro Phe Phe His Ile Gly Thr Ala Val Ser Leu Ala Gln Leu Gln 
35 40 45 

Pro Ser Lys Glu His Glu Arg Ala Leu Ile Ala Gln His Phe Asn Ser 
50 55 60 

Leu Thr Ala Glu Asn Leu Met Lys Trp Glu Glu Ile Gln Pro Thr Glu 
65 70 75 80 

Gly Asn Phe Asp Phe Lys Ala Ala Asp Gln Leu Val Ala Phe Ala Glu 
85 90 95 

Gln His Gln Met Trp Met Ile Gly His Thr Ile Leu Trp His Glu Gln 
100 105 110 

Thr Pro Asp Trp Val Phe Gln Gly Leu Asp Gly Lys Pro Ala Ser Lys 
115 120 125 

Gln Leu Leu Leu Ala Arg Leu Thr Lys His Ile Gln Thr Val Val Gly 
130 135 140 

Arg Tyr Gln Gly Arg Val Asn Gly Trp Asp Val Val Asn Glu Ala Leu 
145 150 155 160 

Asn Glu Asp Gly Ser Leu Arg Asp Thr Pro Trp Arg Arg Ile Leu Gly 
165 170 175 

Asp Asp Tyr Ile Ala Thr Thr Phe Ala Leu Val His Gln Val Asp Pro 
180 185 190 

Lys Ala Lys Leu Tyr Tyr Asn Asp Tyr Asn Leu Phe Lys Pro Glu Lys 
195 200 205 

Arg Ala Gly Val Leu Arg Ile Ile Gln Gln Leu Gln Gln Lys Asn Val 
210 215 220 

Pro Ile His Ala Ile Gly Glu Gln Ala His Tyr Gly Leu Asp Ser Pro 
225 230 235 240 

Ala Phe Lys Asp Val Glu Asp Ser Ile Asn Ala Phe Ala Ala Thr Gly 
245 250 255 

Leu Asp Val Met Leu Thr Glu Leu Glu Ile Ser Val Leu Pro Tyr Pro 
260 265 270 

Ser Gly Met Thr Gln Gly Ala Asp Ile Ser Gln His Gln Glu Leu Gln 
275 280 285 

Glu Gln Leu Asn Pro Tyr Arg Asp Gly Leu Pro Lys Ala Val Glu Gln 
290 295 300 

Ala Trp Gln Gln Arg Tyr Leu Asp Leu Phe Ser Leu Leu Leu Arg Gln 
305 310 315 320 

Gln Gln Lys Leu His Arg Val Thr Phe Trp Gly Leu Asp Asp Gly Gln 
325 330 335 

Ser Trp Arg Asn Asn Phe Pro Met Arg Gly Arg Thr Asp Tyr Pro Leu 
340 345 350 

Leu Phe Asp Arg Lys Leu Gln Ala Lys Pro Leu Leu Ser Ala Leu Thr 
355 360 365 

Ala Leu Ala Ala Asp Gln Thr Lys Ala Lys Pro Lys Met Asn Gln Leu 
370 375 380 

Gly Phe Ala Pro Thr Ser Thr Lys Leu Leu Ile Val Pro Gly Arg Gln 
385 390 395 400 

Ser Val Pro Phe His Val Leu Asp Thr Glu Thr Gly Gln Thr Val Leu 
405 410 415 

Gln Gly Gln Ser Ser Ala Ala Arg Phe Trp Pro Glu Ser Gly Glu Trp 
420 425 430 

Val Ser Ala Ala Asp Phe Ser Ala Val Ile Thr Pro Gly Thr Tyr Gln 
435 440 445 

Ile Asn Ile Ser Gly Thr Pro Pro Gln Thr Val Lys Ile Gln Ala Glu 
450 455 460 

Pro Tyr Ala Ala Leu His Asp Ala Ala Ile Lys Ala Tyr Tyr Phe Asn 
465 470 475 480 

Arg Ala Ser Leu Thr Leu Glu Pro Lys Phe Ala Gly Pro Trp Ala Arg 
485 490 495 

Ala Ala Gly His Pro Asp Thr Lys Val Arg Val His Ala Ser Ala Ala 
500 505 510 

Ser Ala Ser Arg Pro Glu Gly Tyr Glu Leu Ser Ala Ala Lys Gly Trp 
515 520 525 

Tyr Asp Ala Gly Asp Tyr Asn Lys Tyr Val Val Asn Ser Gly Ile Thr 
530 535 540 

Ser Tyr Thr Leu Leu Gln Ala Trp Gln Asp Phe Pro Glu Phe Tyr Gln 
545 550 555 560 

Ser Arg Thr Trp Asn Ile Pro Glu Ser Gly Asn Ala Val Pro Asp Ile 
565 570 575 

Leu Asp Glu Thr Leu Trp Asn Leu Gln Trp Phe Ser Ala Met Gln Asp 
580 585 590 

Pro Asn Asp Gly Gly Val Tyr His Lys Leu Thr Glu Leu Asn Phe Ser 
595 600 605 

Ala Thr Gln Met Pro Asp Gln Val Thr Ala Glu Arg Tyr Val Val Gln 
610 615 620 

Lys Thr Thr Ala Ala Ala Leu Asn Phe Ala Ala Val Leu Ala Lys Ala 
625 630 635 640 

Ser Thr Val Phe Ala Lys Phe Asp Ala Gln Leu Pro Gly Leu Ser Gln 
645 650 655 

Gln Tyr Arg Gln Gln Ala Leu Leu Ala Trp Gln Trp Ala Gln Lys Asn 
660 665 670 

Pro Gln Gln Ile Tyr Gln Gln Pro Lys Asp Val His Thr Gly Ala Tyr 
675 680 685 

Gly Asp Lys Gln Leu Ala Asp Glu Trp Ala Trp Ala Gly Ala Glu Leu 
690 695 700 

Tyr Leu Leu Thr Gly Glu Gln Ser Tyr Leu Gln Pro Leu Leu Ala Leu 
705 710 715 720 

Asp Thr Pro Ile Ser Ala Ala Ser Trp Ala Ser Val Ser Ala Leu Gly 
725 730 735 

Tyr Phe Ser Leu Ala Ser Ala Lys Gln Leu Glu Pro Ala Leu Arg Gln 
740 745 750 

Gln Val Gln Gln Lys Ile Gln Gln Ala Ala Ala Gln Ile Leu Gln Glu 
755 760 765 

His Gln Thr Ser Ala Tyr Gln Val Ala Met Thr Lys Asn Asp Phe Val 
770 775 780 

Trp Gly Ser Asn Ala Val Ala Met Asn Lys Ala Met Leu Leu Tyr Gln 
785 790 795 800 

Ala Trp Lys Ile Ala Pro Lys Pro Glu Leu Leu Gln Ala Met Gln Gly 
805 810 815 

Leu Val Asp Tyr Val Leu Gly Arg Asn Pro Leu Gln Gln Ser Tyr Val 
820 825 830 

Thr Gly Phe Gly Glu Gln Ser Pro Gln Gln Ile His His Arg Pro Ser 
835 840 845 

Ala Ala Asp Ala Ile Lys Ala Pro Val Pro Gly Trp Leu Val Gly Gly 
850 855 860 

Ala Gln Pro Gly Lys Gln Asp Lys Cys Thr Tyr Ala Gly Ala Leu Pro 
865 870 875 880 

Ala Val Gly Ala Leu Pro Ala Ala Ser Thr Leu Pro Ala Thr Thr Tyr 
885 890 895 

Leu Asp Asp Trp Cys Ser Tyr Ala Thr Asn Glu Val Ala Ile Asn Trp 
900 905 910 

Asn Ala Pro Leu Val Tyr Val Leu Ala Trp His Leu Ser Gln Asn Thr 
915 920 925 

Lys Thr Pro 
930 

<210> 291 

<211> 1230 

<212> DNA 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 291 

atggtcaaag aaagaagttt tcttcatcat tcattcaata ggggcgaaaa tggacaggac 60 

agtctgatgt ggaaaaaaga ggcggatgat cgaatctcag agcatagaca aagagatctt 120 

gtgatcaacg taacaaacgg tgaaaaaaag ccaatagcag gtatagaggt tgaaataaag 180 

caaatcagac atgaattcgc ctttggttca gcgatgaatg atcaagtgtt atttaatcaa 240 

caatatgctg attttttcgt gaagtatttt aattgggctg tttttgaaaa tgaggcaaaa 300 

tggtatgcga atgagccaca aagagggaga atcacctacg aaaaagcaga tgcgatgctg 360 

aattttgcag atcgacatca gcttccagtg agagggcacg ctttgttttg ggaggtagag 420 

gatgcgaatc caagctggct aaggtcactg ccaaatcatg aagtatatga agccatgaaa 480 

aaccggcttg agcatgcggg caatcacttt aagggaaggt tccgtcattg ggatgtaaac 540 
aatgaaatga tgcatggttc attttttaaa gatcgctttg ggaaaaatat ttggaagtgg 600 

atgtatgaag aaacgaaaaa aattgaccct caagcactat tgtttgtgaa tgattataat 660 

gtgatctcat atggtgaaca ccatgcctat aaagcgcata tcaatgaact gcgtcagtta 720 

ggcgcaccta ttgaggcgat tggggttcaa ggccattttg aagaacgggt cgatccagtc 780 

attgtcaaag agagactcga tgtgcttgct gagctaggtc ttccaatatg ggtcacagag 840 

tacgattcgg ttcaccctga ccctaatcga agagcggata acctggaagc tttatatcgc 900 

gtcgcattta gtcatccagc cgtaaaagga gtgctgatgt ggggattttg ggcaggtgcc 960 

cattggagag gggaaaatgc agccatcgtg aattatgatt ggtctttaaa tgaagcagga 1020 

agacgttatg aaaagcttct aaatgagtgg acgacccaaa gaattgaaaa aacagatgct 1080 

aatggccatg tgagatgtcc agcatttcac ggaacatatg aggttcgaat cggtaaagaa 1140 

agtaaaatgt tgaaacagca gacgattgaa cttgattcaa atgaacaaac accgtttcaa 1200 

ctagacgtga tcctgcctca agaaggatag 1230 

<210> 292 
<211> 409 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 292 

Met Val Lys Glu Arg Ser Phe Leu His His Ser Phe Asn Arg Gly Glu 
1 5 10 15 

Asn Gly Gln Asp Ser Leu Met Trp Lys Lys Glu Ala Asp Asp Arg Ile 
20 25 30 

Ser Glu His Arg Gln Arg Asp Leu Val Ile Asn Val Thr Asn Gly Glu 
35 40 45 

Lys Lys Pro Ile Ala Gly Ile Glu Val Glu Ile Lys Gln Ile Arg His 
50 55 60 

Glu Phe Ala Phe Gly Ser Ala Met Asn Asp Gln Val Leu Phe Asn Gln 
65 70 75 80 

Gln Tyr Ala Asp Phe Phe Val Lys Tyr Phe Asn Trp Ala Val Phe Glu 
85 90 95 

Asn Glu Ala Lys Trp Tyr Ala Asn Glu Pro Gln Arg Gly Arg Ile Thr 
100 105 110 

Tyr Glu Lys Ala Asp Ala Met Leu Asn Phe Ala Asp Arg His Gln Leu 
115 120 125 

Pro Val Arg Gly His Ala Leu Phe Trp Glu Val Glu Asp Ala Asn Pro 
130 135 140 

Ser Trp Leu Arg Ser Leu Pro Asn His Glu Val Tyr Glu Ala Met Lys 
145 150 155 160 

Asn Arg Leu Glu His Ala Gly Asn His Phe Lys Gly Arg Phe Arg His 
165 170 175 

Trp Asp Val Asn Asn Glu Met Met His Gly Ser Phe Phe Lys Asp Arg 
180 185 190 

Phe Gly Lys Asn Ile Trp Lys Trp Met Tyr Glu Glu Thr Lys Lys Ile 
195 200 205 

Asp Pro Gln Ala Leu Leu Phe Val Asn Asp Tyr Asn Val Ile Ser Tyr 
210 215 220 

Gly Glu His His Ala Tyr Lys Ala His Ile Asn Glu Leu Arg Gln Leu 
225 230 235 240 

Gly Ala Pro Ile Glu Ala Ile Gly Val Gln Gly His Phe Glu Glu Arg 
245 250 255 

Val Asp Pro Val Ile Val Lys Glu Arg Leu Asp Val Leu Ala Glu Leu 
260 265 270 

Gly Leu Pro Ile Trp Val Thr Glu Tyr Asp Ser Val His Pro Asp Pro 
275 280 285 

Asn Arg Arg Ala Asp Asn Leu Glu Ala Leu Tyr Arg Val Ala Phe Ser 
290 295 300 

His Pro Ala Val Lys Gly Val Leu Met Trp Gly Phe Trp Ala Gly Ala 
305 310 315 320 

His Trp Arg Gly Glu Asn Ala Ala Ile Val Asn Tyr Asp Trp Ser Leu 
325 330 335 

Asn Glu Ala Gly Arg Arg Tyr Glu Lys Leu Leu Asn Glu Trp Thr Thr 
340 345 350 

Gln Arg Ile Glu Lys Thr Asp Ala Asn Gly His Val Arg Cys Pro Ala 
355 360 365 

Phe His Gly Thr Tyr Glu Val Arg Ile Gly Lys Glu Ser Lys Met Leu 
370 375 380 

Lys Gln Gln Thr Ile Glu Leu Asp Ser Asn Glu Gln Thr Pro Phe Gln 
385 390 395 400 

Leu Asp Val Ile Leu Pro Gln Glu Gly 
405 

<210> 293 
<211> 1002 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 293 

atgaagatga acagctccct cccctccctc cgcgatgtat tcgcgaatga tttccgcatc 60 

ggggcggcgg tcaatcctgt gacgatcgag atgcaaaaac agttgttgat cgatcatgtc 120 

aacagtatta cggcagagaa ccatatgaag tttgagcatc ttcagccgga agaagggaaa 180 

tttacctttc aggaagcgga tcggattgtg gattttgctt gttcgcaccg aatggcggtt 240 

cgagggcaca cacttgtatg gcacaaccag actccggatt gggtgtttca agatggtcaa 300 

ggccatttcg tcagtcggga tgtgttgctt gagcggatga aatgtcacat ttcaactgtt 360 

gtacggcgat acaagggaaa aatatattgt tgggatgtca tcaacgaagc ggtagccgac 420 

gaaggagacg aattgttgag gccgtcgaag tggcgacaaa tcatcgggga cgattttatg 480 

gaacaagcat ttctctacgc ttatgaagct gacccagatg cactgctttt ttacaatgac 540 

tataatgaat gttttccgga aaagagagaa aaaatttttg cacttgtcaa atcgctgcgt 600 

gataaaggca ttccgattca tggcatcggg atgcaagcgc attggagttt gactcgcccg 660 

tcgcttgatg aaattcgtgc ggccattgaa cgatatgcgt cccttggtgt tgttcttcat 720 

attacggaac tcgatgtatc catgtttgaa tttcacgatc gtcgaaccga tttggcagct 780 

ccaacgtcag aaatgatcga acggcaggca gagcggtatg ggcaaatttt tgctttgttt 840 

aaggagtatc gcgatgttat tcaaagtgtc acattttggg gaattgctga tgaccataca 900 

tggctcgata actttccagt gcacgggaga aaaaactggc cgcttttgtt cgatgaacag 960 

cataaaccga aaccagcttt ttggcgggca gtgagtgtct ga 1002 

<210> 294 

<211> 333 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 294 

Met Lys Met Asn Ser Ser Leu Pro Ser Leu Arg Asp Val Phe Ala Asn 
1 5 10 15 

Asp Phe Arg Ile Gly Ala Ala Val Asn Pro Val Thr Ile Glu Met Gln 
20 25 30 

Lys Gln Leu Leu Ile Asp His Val Asn Ser Ile Thr Ala Glu Asn His 
35 40 45 

Met Lys Phe Glu His Leu Gln Pro Glu Glu Gly Lys Phe Thr Phe Gln 
50 55 60 

Glu Ala Asp Arg Ile Val Asp Phe Ala Cys Ser His Arg Met Ala Val 
65 70 75 80 

Arg Gly His Thr Leu Val Trp His Asn Gln Thr Pro Asp Trp Val Phe 
85 90 95 

Gln Asp Gly Gln Gly His Phe Val Ser Arg Asp Val Leu Leu Glu Arg 
100 105 110 

Met Lys Cys His Ile Ser Thr Val Val Arg Arg Tyr Lys Gly Lys Ile 
115 120 125 

Tyr Cys Trp Asp Val Ile Asn Glu Ala Val Ala Asp Glu Gly Asp Glu 
130 135 140 

Leu Leu Arg Pro Ser Lys Trp Arg Gln Ile Ile Gly Asp Asp Phe Met 
145 150 155 160 

Glu Gln Ala Phe Leu Tyr Ala Tyr Glu Ala Asp Pro Asp Ala Leu Leu 
165 170 175 

Phe Tyr Asn Asp Tyr Asn Glu Cys Phe Pro Glu Lys Arg Glu Lys Ile 
180 185 190 

Phe Ala Leu Val Lys Ser Leu Arg Asp Lys Gly Ile Pro Ile His Gly 
195 200 205 

Ile Gly Met Gln Ala His Trp Ser Leu Thr Arg Pro Ser Leu Asp Glu 
210 215 220 

Ile Arg Ala Ala Ile Glu Arg Tyr Ala Ser Leu Gly Val Val Leu His 
225 230 235 240 

Ile Thr Glu Leu Asp Val Ser Met Phe Glu Phe His Asp Arg Arg Thr 
245 250 255 

Asp Leu Ala Ala Pro Thr Ser Glu Met Ile Glu Arg Gln Ala Glu Arg 
260 265 270 

Tyr Gly Gln Ile Phe Ala Leu Phe Lys Glu Tyr Arg Asp Val Ile Gln 
275 280 285 

Ser Val Thr Phe Trp Gly Ile Ala Asp Asp His Thr Trp Leu Asp Asn 
290 295 300 

Phe Pro Val His Gly Arg Lys Asn Trp Pro Leu Leu Phe Asp Glu Gln 
305 310 315 320 

His Lys Pro Lys Pro Ala Phe Trp Arg Ala Val Ser Val 
325 330 



<210> 295 
<211> 1134 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 



<400> 295 

atgagatccg tccgcatcgt cacctttgct ctcgccgccg cgctggccgt cccgctggtg 60 

acgtcgacgg ccacggccaa gccgtccgcc gaccacgagg ccgcgcccca ctccaacgcc 120 

aagttcgacc gcctgcgctg ggccgccccc gaagggttct tcataggctc cgcggcggcc 180 

ggcggcggcc accacctcga acaggactac ccggacccct tcaccttcga caagaagtac 240 

cggaagatcc tgggccagca gttcaactcg gtctccgccg agaaccagat gaagtgggag 300 

ttcatccacc ccgagcgcga ccagtaccgc ttcgaggagg ccgacgccat cgtcgagttc 360 

gcccagcgga accgccaggc cgtgcgcggg cacaccctcc tgtggcacag ccagaacccc 420 

gaatggctgg aggagggcga cttcaccaag gaggaactgc gcgccatcct caaggaccac 480 

atcgacacgg tcgtcggccg ctacgccggc aagatccagc agtgggacgt ggccaacgag 540 

atcttcaacg accaggccga gctgcgcacc gacgagaaca tctggatacg tgagctcggc 600 

ccggagatcg tcgcggacgc cttccgctgg gcccacgagg ccgaccccga ggccaagctg 660 

ttcctcaacg actacaacgt cgagggcatc aacgccaaga gcgacgccta ctacgagctc 720 

gcccaggaga tgctggagca gggcgtgccg ctccacggat tcggcgccca gggccacctg 780 

agcacccgct acggcttccc gggcgacctg cagcagaacc tgcagcggtt cgccgacctc 840 

ggtctggaga ccgccatcac cgagatcgac gtccgcatgg acctcccggc gagcggcaag 900 

cccaccaagg agcagctgcg gcagcaggcc gactactacc agcaggcact gtcggcctgc 960 

ctggccgtga acgactgcaa ctccttcacc atctggggct tcaccgacaa gtactcgtgg 1020 

gtgccggtct tcttcgaggg tgagggcagc gccacggtca tgacggagaa gttcgtccgc 1080 

aagccggcct tcttcgccct gcagtccacc ctgaaggagg cgcgcaagcg ctga 1134 



<210> 296 
<211> 377 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(26) 

<400> 296 

Met Arg Ser Val Arg Ile Val Thr Phe Ala Leu Ala Ala Ala Leu Ala 
1 5 10 15 

Val Pro Leu Val Thr Ser Thr Ala Thr Ala Lys Pro Ser Ala Asp His 
20 25 30 

Glu Ala Ala Pro His Ser Asn Ala Lys Phe Asp Arg Leu Arg Trp Ala 
35 40 45 

Ala Pro Glu Gly Phe Phe Ile Gly Ser Ala Ala Ala Gly Gly Gly His 
50 55 60 

His Leu Glu Gln Asp Tyr Pro Asp Pro Phe Thr Phe Asp Lys Lys Tyr 
65 70 75 80 

Arg Lys Ile Leu Gly Gln Gln Phe Asn Ser Val Ser Ala Glu Asn Gln 
85 90 95 

Met Lys Trp Glu Phe Ile His Pro Glu Arg Asp Gln Tyr Arg Phe Glu 
100 105 110 

Glu Ala Asp Ala Ile Val Glu Phe Ala Gln Arg Asn Arg Gln Ala Val 
115 120 125 

Arg Gly His Thr Leu Leu Trp His Ser Gln Asn Pro Glu Trp Leu Glu 
130 135 140 

Glu Gly Asp Phe Thr Lys Glu Glu Leu Arg Ala Ile Leu Lys Asp His 
145 150 155 160 

Ile Asp Thr Val Val Gly Arg Tyr Ala Gly Lys Ile Gln Gln Trp Asp 
165 170 175 

Val Ala Asn Glu Ile Phe Asn Asp Gln Ala Glu Leu Arg Thr Asp Glu 
180 185 190 

Asn Ile Trp Ile Arg Glu Leu Gly Pro Glu Ile Val Ala Asp Ala Phe 
195 200 205 

Arg Trp Ala His Glu Ala Asp Pro Glu Ala Lys Leu Phe Leu Asn Asp 
210 215 220 

Tyr Asn Val Glu Gly Ile Asn Ala Lys Ser Asp Ala Tyr Tyr Glu Leu 
225 230 235 240 

Ala Gln Glu Met Leu Glu Gln Gly Val Pro Leu His Gly Phe Gly Ala 
245 250 255 

Gln Gly His Leu Ser Thr Arg Tyr Gly Phe Pro Gly Asp Leu Gln Gln 
260 265 270 

Asn Leu Gln Arg Phe Ala Asp Leu Gly Leu Glu Thr Ala Ile Thr Glu 
275 280 285 

Ile Asp Val Arg Met Asp Leu Pro Ala Ser Gly Lys Pro Thr Lys Glu 
290 295 300 

Gln Leu Arg Gln Gln Ala Asp Tyr Tyr Gln Gln Ala Leu Ser Ala Cys 
305 310 315 320 

Leu Ala Val Asn Asp Cys Asn Ser Phe Thr Ile Trp Gly Phe Thr Asp 
325 330 335 

Lys Tyr Ser Trp Val Pro Val Phe Phe Glu Gly Glu Gly Ser Ala Thr 
340 345 350 

Val Met Thr Glu Lys Phe Val Arg Lys Pro Ala Phe Phe Ala Leu Gln 
355 360 365 

Ser Thr Leu Lys Glu Ala Arg Lys Arg 
370 375 

<210> 297 
<211> 1842 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 297 

ttgaggtcag gcgcgttctg tttcatcata gtcgttttaa tcctgaacct tatatgcagg 60 

gagttgtatg agtgtaaaaa agttgttacc gcagcactgg tatgcttggc tttcgggtca 120 

tcgctgactt gggggcaatg caccacattt accaccagta ccattcggaa ttgcgatggt 180 

atagattacg agctctggag ccagaataac tctggcacga ccaatatgca aatcacggga 240 

gggaactcga atccaaacgg tggaaccttt gaggcgacat ggagtggcac gatcaatgtt 300 

ctattccgcg cgggtaaaaa atggggcaca tccagcacca gtacccccaa aaccatcggc 360 

aatatctctc ttgaattcgc agcgacatgg agttcggtcg ataatgtgaa aatgcttggc 420 

atctatggct gggcgtatta tccctcggga agcgaaccaa caaaaacgga aagcggtcaa 480 

agcacgaact tttccaatca gattgagtat tacatcattc aagaccgcgg tagctataac 540 

ccggcatccg gcggcaccaa cgccaaaaag tacggtgaag ggacgatcga cggaatcgcg 600 

tatgattttt atgtcgccga ccgtatcggc caggccatgc tgacaggaac gggaaatttc 660 

aaacagtact tcagcgtgcc gaagagcaca agcagtcaca ggcaaagcgg cacggtttcc 720 

gtctccaaac attttgaggc ctgggaaaaa gcgggcatga agatgatgga ttgtcggtta 780 

tacgaagtcg cgatgaaagt ggaatcgtat accggttccg cgaatggcaa cggctcggcg 840 

aaagtgacca aaaatctcct cacgatcggc ggaagcagca gcaacgagtt tagtctcgta 900 

acgaatgttt ctcctgccag cgcgggaacg gtgtccaaga gcccggacaa cgcatcctat 960 

gccccgaacg cctccgttca gctcacggcg accccgaata ccggttggaa gtttgtgggc 1020 

tgggaagggg acgcctcggg ttccacgagc ccaaccagcg ttaccatgag caaagacctc 1080 

acggttacag cgaagtttga gctggtatcg gaagaaggca gcacaaacct gatccaggat 1140 

ggcaacttcc cgagcggcag cgtaatctct acagatgacg gggcttcatg gaaactcgga 1200 

caaggggaaa actggggaaa ttccgcagcc acaacgagcg tcagcaatgg aatcgcgaca 1260 

gtcaatgtga caactgtcgg agcggaagct tatcaaccgc agcttgtaca gtacggtttg 1320 

ggactcgaca tggacatgag ttacaaactt accttcaagg caagagccga tgcggcaagg 1380 

aagattgaag ttgcgttcca gcaggcggtg gatccttggg ctggttatgc ttcccaggaa 1440 

ttcgacctga ccacgaccga tcaggatttc gagttcgtat tcacgatgac caacgccagc 1500 

gacccggcat cacagttcgc gttcaatctt ggccaggcga caggcgatgt ctatatcagt 1560 

gatgttaaac tggtatacac gacaggcacc acacccatat cccgcaccat agtccgcggc 1620 

aatacggcat tcgtctcggt aagtggcaga accctgaata tttcggcagt cgacgcgtcc 1680 

acacttcaga tcaaggtagt agatataaac ggaaaggtaa gagcgaattt caacacggct 1740 

ggtgcagcaa gtgtttcctt gtccaatatt cctgcgggcc agtacttcgt tggtatcaca 1800 

ggcacaggca taaaacaaat ctcaccgatc gttttggaat aa 1842 

<210> 298 

<211> 613 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 298 

Met Arg Ser Gly Ala Phe Cys Phe Ile Ile Val Val Leu Ile Leu Asn 
1 5 10 15 

Leu Ile Cys Arg Glu Leu Tyr Glu Cys Lys Lys Val Val Thr Ala Ala 
20 25 30 

Leu Val Cys Leu Ala Phe Gly Ser Ser Leu Thr Trp Gly Gln Cys Thr 
35 40 45 

Thr Phe Thr Thr Ser Thr Ile Arg Asn Cys Asp Gly Ile Asp Tyr Glu 
50 55 60 

Leu Trp Ser Gln Asn Asn Ser Gly Thr Thr Asn Met Gln Ile Thr Gly 
65 70 75 80 

Gly Asn Ser Asn Pro Asn Gly Gly Thr Phe Glu Ala Thr Trp Ser Gly 
85 90 95 

Thr Ile Asn Val Leu Phe Arg Ala Gly Lys Lys Trp Gly Thr Ser Ser 
100 105 110 

Thr Ser Thr Pro Lys Thr Ile Gly Asn Ile Ser Leu Glu Phe Ala Ala 
115 120 125 

Thr Trp Ser Ser Val Asp Asn Val Lys Met Leu Gly Ile Tyr Gly Trp 
130 135 140 

Ala Tyr Tyr Pro Ser Gly Ser Glu Pro Thr Lys Thr Glu Ser Gly Gln 
145 150 155 160 

Ser Thr Asn Phe Ser Asn Gln Ile Glu Tyr Tyr Ile Ile Gln Asp Arg 
165 170 175 

Gly Ser Tyr Asn Pro Ala Ser Gly Gly Thr Asn Ala Lys Lys Tyr Gly 
180 185 190 

Glu Gly Thr Ile Asp Gly Ile Ala Tyr Asp Phe Tyr Val Ala Asp Arg 
195 200 205 

Ile Gly Gln Ala Met Leu Thr Gly Thr Gly Asn Phe Lys Gln Tyr Phe 
210 215 220 

Ser Val Pro Lys Ser Thr Ser Ser His Arg Gln Ser Gly Thr Val Ser 
225 230 235 240 

Val Ser Lys His Phe Glu Ala Trp Glu Lys Ala Gly Met Lys Met Met 
245 250 255 

Asp Cys Arg Leu Tyr Glu Val Ala Met Lys Val Glu Ser Tyr Thr Gly 
260 265 270 

Ser Ala Asn Gly Asn Gly Ser Ala Lys Val Thr Lys Asn Leu Leu Thr 
275 280 285 

Ile Gly Gly Ser Ser Ser Asn Glu Phe Ser Leu Val Thr Asn Val Ser 
290 295 300 

Pro Ala Ser Ala Gly Thr Val Ser Lys Ser Pro Asp Asn Ala Ser Tyr 
305 310 315 320 

Ala Pro Asn Ala Ser Val Gln Leu Thr Ala Thr Pro Asn Thr Gly Trp 
325 330 335 

Lys Phe Val Gly Trp Glu Gly Asp Ala Ser Gly Ser Thr Ser Pro Thr 
340 345 350 

Ser Val Thr Met Ser Lys Asp Leu Thr Val Thr Ala Lys Phe Glu Leu 
355 360 365 

Val Ser Glu Glu Gly Ser Thr Asn Leu Ile Gln Asp Gly Asn Phe Pro 
370 375 380 

Ser Gly Ser Val Ile Ser Thr Asp Asp Gly Ala Ser Trp Lys Leu Gly 
385 390 395 400 

Gln Gly Glu Asn Trp Gly Asn Ser Ala Ala Thr Thr Ser Val Ser Asn 
405 410 415 

Gly Ile Ala Thr Val Asn Val Thr Thr Val Gly Ala Glu Ala Tyr Gln 
420 425 430 

Pro Gln Leu Val Gln Tyr Gly Leu Gly Leu Asp Met Asp Met Ser Tyr 
435 440 445 

Lys Leu Thr Phe Lys Ala Arg Ala Asp Ala Ala Arg Lys Ile Glu Val 
450 455 460 

Ala Phe Gln Gln Ala Val Asp Pro Trp Ala Gly Tyr Ala Ser Gln Glu 
465 470 475 480 

Phe Asp Leu Thr Thr Thr Asp Gln Asp Phe Glu Phe Val Phe Thr Met 
485 490 495 

Thr Asn Ala Ser Asp Pro Ala Ser Gln Phe Ala Phe Asn Leu Gly Gln 
500 505 510 

Ala Thr Gly Asp Val Tyr Ile Ser Asp Val Lys Leu Val Tyr Thr Thr 
515 520 525 

Gly Thr Thr Pro Ile Ser Arg Thr Ile Val Arg Gly Asn Thr Ala Phe 
530 535 540 

Val Ser Val Ser Gly Arg Thr Leu Asn Ile Ser Ala Val Asp Ala Ser 
545 550 555 560 

Thr Leu Gln Ile Lys Val Val Asp Ile Asn Gly Lys Val Arg Ala Asn 
565 570 575 

Phe Asn Thr Ala Gly Ala Ala Ser Val Ser Leu Ser Asn Ile Pro Ala 
580 585 590 

Gly Gln Tyr Phe Val Gly Ile Thr Gly Thr Gly Ile Lys Gln Ile Ser 
595 600 605 

Pro Ile Val Leu Glu 
610 



<210> 299 
<211> 1047 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 



<400> 299 

atgtttttga gtctcaaaag agtggcggcg cttgtttgcg tcgccggtct cggcatctct 60 

gcggcccaag cacagacctg cctgacctcg agtcaaaccg gcactaacaa cggcttctac 120 

tattcgttct ggaaagacaa tcccggcacc gtgaatttct gtctgcagtc cggcggccgc 180 

tacacctcca actggagcgg catcaacaac tgggtcggcg gaaagggatg gcagacgggt 240 

tcccgcagag tggtgaacta ctcgggcagc ttcaattcgc ctggcaatgg gtacctgact 300 

ctctatgggt ggaccaccaa tccgctcatc gagtactaca ttgtcgacaa ctggggcacg 360 

tatcgtccgc cgggtgggca ggggttcatg ggcacggtga ccagcgatgg cgcgacgtat 420 

gacgtctatc gcacgcagcg cgtcaatcag ccctgcatca ccggcagcag ttgcacgttc 480 

tatcaatact ggagcgtgcg gcagtcgaag cggaccggtg gcacgatcac caccggcaac 540 

cacttcgatg cctgggccag ctacggaatg aatctgggcg ctcacaacta ccagatcatg 600 

gcgaccgagg gctatcaaag cagcggcagc tctgacatca cggtgagtga gggaagcagc 660 

agcagtagca gcggtggtgg cagcagcacg agcagcagtg gcggcggcgg caccaagagc 720 

ttcacggtcc gggcgcgcgg aaccgcgggt ggtgagtcca tcacgctgcg tgtgaacaat 780 

cagaacgtgc agacctggac gctgggcacc agcatgacga actacacggc atcgacgtcg 840 

ttgagcggtg gcatcaccgt ggcttacacg aacgacagtg gcaatcgcga cgtgcaggtg 900 

gattacatca tcgtgaacgg ctcgacgcgt cagtcagaag cgcagagcta caacaccggg 960 

ctctatgcca acggtagttg tggtggcggc tccaatagcg aatggatgca ttgcaacggc 1020 

gccattggct acgggaatac gccgtag 1047 



<210> 300 
<211> 348 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(24) 

<400> 300 

Met Phe Leu Ser Leu Lys Arg Val Ala Ala Leu Val Cys Val Ala Gly 
1 5 10 15 

Leu Gly Ile Ser Ala Ala Gln Ala Gln Thr Cys Leu Thr Ser Ser Gln 
20 25 30 

Thr Gly Thr Asn Asn Gly Phe Tyr Tyr Ser Phe Trp Lys Asp Asn Pro 
35 40 45 

Gly Thr Val Asn Phe Cys Leu Gln Ser Gly Gly Arg Tyr Thr Ser Asn 
50 55 60 

Trp Ser Gly Ile Asn Asn Trp Val Gly Gly Lys Gly Trp Gln Thr Gly 
65 70 75 80 

Ser Arg Arg Val Val Asn Tyr Ser Gly Ser Phe Asn Ser Pro Gly Asn 
85 90 95 

Gly Tyr Leu Thr Leu Tyr Gly Trp Thr Thr Asn Pro Leu Ile Glu Tyr 
100 105 110 

Tyr Ile Val Asp Asn Trp Gly Thr Tyr Arg Pro Pro Gly Gly Gln Gly 
115 120 125 

Phe Met Gly Thr Val Thr Ser Asp Gly Ala Thr Tyr Asp Val Tyr Arg 
130 135 140 

Thr Gln Arg Val Asn Gln Pro Cys Ile Thr Gly Ser Ser Cys Thr Phe 
145 150 155 160 

Tyr Gln Tyr Trp Ser Val Arg Gln Ser Lys Arg Thr Gly Gly Thr Ile 
165 170 175 

Thr Thr Gly Asn His Phe Asp Ala Trp Ala Ser Tyr Gly Met Asn Leu 
180 185 190 

Gly Ala His Asn Tyr Gln Ile Met Ala Thr Glu Gly Tyr Gln Ser Ser 
195 200 205 

Gly Ser Ser Asp Ile Thr Val Ser Glu Gly Ser Ser Ser Ser Ser Ser 
210 215 220 

Gly Gly Gly Ser Ser Thr Ser Ser Ser Gly Gly Gly Gly Thr Lys Ser 
225 230 235 240 

Phe Thr Val Arg Ala Arg Gly Thr Ala Gly Gly Glu Ser Ile Thr Leu 
245 250 255 

Arg Val Asn Asn Gln Asn Val Gln Thr Trp Thr Leu Gly Thr Ser Met 
260 265 270 

Thr Asn Tyr Thr Ala Ser Thr Ser Leu Ser Gly Gly Ile Thr Val Ala 
275 280 285 

Tyr Thr Asn Asp Ser Gly Asn Arg Asp Val Gln Val Asp Tyr Ile Ile 
290 295 300 

Val Asn Gly Ser Thr Arg Gln Ser Glu Ala Gln Ser Tyr Asn Thr Gly 
305 310 315 320 

Leu Tyr Ala Asn Gly Ser Cys Gly Gly Gly Ser Asn Ser Glu Trp Met 
325 330 335 

His Cys Asn Gly Ala Ile Gly Tyr Gly Asn Thr Pro 
340 345 

<210> 301 
<211> 642 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 301 

atgtttaagt ttacaaagaa attcttagtt gggttaacgg cagctttgat gagtatgagc 60 

ttgttttcgg caaacgcctc tgcagctaac acagactact ggcaaaattg gactgatggg 120 

ggcggaacag taaacgctgt caatgggtct ggcgggaatt acagtgtgaa ttggtctaat 180 

accggaaatt tcgttgttgg taaaggttgg actacaggtt cgccatttag gacgataaac 240 

tataatgccg gagtttgggc gccgaacggc aatgcatatt tgactttata tggttggacg 300 

cgatcccctc tcatagaata ttatgtagtg gattcatggg gtacttatag acctactgga 360 

acgtataaag gtacggttta cagtgatggg ggtacatatg acgtgtacac aactacacgt 420 

tatgatgcac cttccattga tggcgataaa actactttta cgcagtactg gagtgttcgc 480 

cagtcgaaga gaccaactgg aagcaacgct acaatcactt tcagcaatca cgttaacgca 540 

tggaagagat atgggatgaa tctgggtagt aattggtctt accaagtctt agcgacagag 600 

ggatatcgaa gtagtggaag ttctaacgta acagtgtggt aa 642 

<210> 302 
<211> 213 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(28) 

<400> 302 

Met Phe Lys Phe Thr Lys Lys Phe Leu Val Gly Leu Thr Ala Ala Leu 
1 5 10 15 

Met Ser Met Ser Leu Phe Ser Ala Asn Ala Ser Ala Ala Asn Thr Asp 
20 25 30 

Tyr Trp Gln Asn Trp Thr Asp Gly Gly Gly Thr Val Asn Ala Val Asn 
35 40 45 

Gly Ser Gly Gly Asn Tyr Ser Val Asn Trp Ser Asn Thr Gly Asn Phe 
50 55 60 

Val Val Gly Lys Gly Trp Thr Thr Gly Ser Pro Phe Arg Thr Ile Asn 
65 70 75 80 

Tyr Asn Ala Gly Val Trp Ala Pro Asn Gly Asn Ala Tyr Leu Thr Leu 
85 90 95 

Tyr Gly Trp Thr Arg Ser Pro Leu Ile Glu Tyr Tyr Val Val Asp Ser 
100 105 110 

Trp Gly Thr Tyr Arg Pro Thr Gly Thr Tyr Lys Gly Thr Val Tyr Ser 
115 120 125 

Asp Gly Gly Thr Tyr Asp Val Tyr Thr Thr Thr Arg Tyr Asp Ala Pro 
130 135 140 

Ser Ile Asp Gly Asp Lys Thr Thr Phe Thr Gln Tyr Trp Ser Val Arg 
145 150 155 160 

Gln Ser Lys Arg Pro Thr Gly Ser Asn Ala Thr Ile Thr Phe Ser Asn 
165 170 175 

His Val Asn Ala Trp Lys Arg Tyr Gly Met Asn Leu Gly Ser Asn Trp 
180 185 190 

Ser Tyr Gln Val Leu Ala Thr Glu Gly Tyr Arg Ser Ser Gly Ser Ser 
195 200 205 

Asn Val Thr Val Trp 
210 



<210> 303 
<211> 1404 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 303 

ttgactataa aggcctggcc cgggacggct gccagctata ataccaatag aggttttatc 60 

atgtcctacg ctcagtttaa gggggccgct accctagcga cgtccttcct gctcgcagtc 120 

accttgacag cctgtggagg cagcaaatcc aaacccgttc tgccagaccc atcgaacagc 180 

agctcgtcat caagcagcag ctcgtcatca agcagcagct cctcaagttc ctccagtagc 240 

agttcgagct cttctagtgc tccctccagc caaacgttct tcattgagcc ggatttccag 300 

cttcacaccc tggcggactt cccgattgga gtggcagtct cggcagccaa tgagccatac 360 

agcatcttca accaaaccga tggtactgat cggcaggatg tgatcctgga gcatttcaac 420 

gaaatgaccg ctggcaacat catgaaaatg agctacgtgt acgcaggtca acgtgcaaat 480 

cagcaacccg atcaattcga cttcagcaga gctgatgagc tggttgggtt tgcccacgca 540 

aacagtgtga agattcacgg tcacgccctc gtttggcacg ccgactatca agttccgggt 600 

ttcatgcaga attatgatgg cgactttgct gagatgttgg ccaatcacgc gcggagtgtt 660 

gtggaacatt ttgacgaaga gtttccaggt accgtggtca gctgggatgt ggtcaacgag 720 

gcgataaccg acaacttcgg aaccgataca aatggctggc gccggtcgct gttttacaac 780 

gcgctgccgc ccgcgacaga agacgatatt cctgagtaca tccgcgttgc cttccaggcc 840 

gctcgcgatg ccaacccgga catcgacctc tattacaatg attacgacaa taccgccaac 900 

accaaccggc tgaacaaaac cctgcagatc gccgatgccc tggccgagga cgagctgatc 960 

gacggtgtgg gattccagat gcacgtctat atgacgtacc cgagccttag tcacttccaa 1020 

aacgcgtttc aagaagtggt tgatcgaggc ttgaaggtga agatcaccga gctggacgta 1080 

tcggtggtca acccatacgg tcagagcact ccgccaccgc agcccgtcta cgatgaagcg 1140 

ttggcaggcg cacagaaaaa gcggttctgc gatatcacca gagtctatct ggaaacggtt 1200 

ccggctgagc ttcgcggcgg tctcactgtt tgggggcttg ccgacaacga aagctggttg 1260 

atgcaacagt tcaggaacgc aacgggagcg aactacaccg acgtgtggcc gttgttgttc 1320 

aacgccgacc tgtcagccaa acctacactc caaggcgtgg ccgatgctct gcagggtctc 1380 
ccctgcacca ccgacctcga ctaa 1404

<210> 304 
<211> 467 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(74) 

<400> 304 

Met Thr Ile Lys Ala Trp Pro Gly Thr Ala Ala Ser Tyr Asn Thr Asn

1 5 10 15

Arg Gly Phe Ile Met Ser Tyr Ala Gln Phe Lys Gly Ala Ala Thr Leu

20 25 30

Ala Thr Ser Phe Leu Leu Ala Val Thr Leu Thr Ala Cys Gly Gly Ser

35 40 45

Lys Ser Lys Pro Val Leu Pro Asp Pro Ser Asn Ser Ser Ser Ser Ser

50 55 60

Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser
65 70 75 80

Ser Ser Ser Ser Ser Ser Ala Pro Ser Ser Gln Thr Phe Phe Ile Glu

85 90 95

Pro Asp Phe Gln Leu His Thr Leu Ala Asp Phe Pro Ile Gly Val Ala

100 105 110

Val Ser Ala Ala Asn Glu Pro Tyr Ser Ile Phe Asn Gln Thr Asp Gly

115 120 125

Thr Asp Arg Gln Asp Val Ile Leu Glu His Phe Asn Glu Met Thr Ala

130 135 140

Gly Asn Ile Met Lys Met Ser Tyr Val Tyr Ala Gly Gln Arg Ala Asn
145 150 155 160

Gln Gln Pro Asp Gln Phe Asp Phe Ser Arg Ala Asp Glu Leu Val Gly

165 170 175

Phe Ala His Ala Asn Ser Val Lys Ile His Gly His Ala Leu Val Trp

180 185 190

His Ala Asp Tyr Gln Val Pro Gly Phe Met Gln Asn Tyr Asp Gly Asp

195 200 205

Phe Ala Glu Met Leu Ala Asn His Ala Arg Ser Val Val Glu His Phe

210 215 220

Asp Glu Glu Phe Pro Gly Thr Val Val Ser Trp Asp Val Val Asn Glu
225 230 235 240

Ala Ile Thr Asp Asn Phe Gly Thr Asp Thr Asn Gly Trp Arg Arg Ser

245 250 255

Leu Phe Tyr Asn Ala Leu Pro Pro Ala Thr Glu Asp Asp Ile Pro Glu

260 265 270

Tyr Ile Arg Val Ala Phe Gln Ala Ala Arg Asp Ala Asn Pro Asp Ile

275 280 285

Asp Leu Tyr Tyr Asn Asp Tyr Asp Asn Thr Ala Asn Thr Asn Arg Leu

290 295 300

Asn Lys Thr Leu Gln Ile Ala Asp Ala Leu Ala Glu Asp Glu Leu Ile
305 310 315 320

Asp Gly Val Gly Phe Gln Met His Val Tyr Met Thr Tyr Pro Ser Leu

325 330 335

Ser His Phe Gln Asn Ala Phe Gln Glu Val Val Asp Arg Gly Leu Lys

340 345 350

Val Lys Ile Thr Glu Leu Asp Val Ser Val Val Asn Pro Tyr Gly Gln

355 360 365

Ser Thr Pro Pro Pro Gln Pro Val Tyr Asp Glu Ala Leu Ala Gly Ala

370 375 380

Gln Lys Lys Arg Phe Cys Asp Ile Thr Arg Val Tyr Leu Glu Thr Val
385 390 395 400

Pro Ala Glu Leu Arg Gly Gly Leu Thr Val Trp Gly Leu Ala Asp Asn

405 410 415

Glu Ser Trp Leu Met Gln Gln Phe Arg Asn Ala Thr Gly Ala Asn Tyr

420 425 430 

Thr Asp Val Trp Pro Leu Leu Phe Asn Ala Asp Leu Ser Ala Lys Pro 
435 440 445

Thr Leu Gln Gly Val Ala Asp Ala Leu Gln Gly Leu Pro Cys Thr Thr

450 455 460 

Asp Leu Asp 
465 

<210> 305 
<211> 3705 
<212> DNA 
<213> Bacteria 

<400> 305 

atgaagagta ttgtaaacag agttgtatct atcgttacag ctttaataat gatttttggg 60 

acatcactgc tttcacaaca cataagggca tttgctgatg acactaatac aaatctggtt 120 

tctaatgggg actttgagac aggcacaatt gatggctgga ttaagcaagg taatcctaca 180 

ttagaagtaa cgactgaaca agcaattggg caatacagta tgaaagttac gggtagaaca 240 

cagacatatg aaggacctgc atatagcttt ttaggaaaaa tgcagaaagg tgaatcatat 300 

aatgtatcgc ttaaagttag acttgtttct ggacaaaatt cttctaatcc ttttattacc 360 

gtgactatgt ttagagaaga tgacaatggc aagcattatg atacaatagt ttggcaaaaa 420 

caagtttctg aagattcatg gactactgta agcgggactt atacattaga ttatactgga 480 

acattaaaaa cattatacat gtatgtagaa tcacccgatc caacgctgga atactatatt 540 

gatgatgttg tagtgacacc acaaaatcca atacaagtag gaaatgtgat taccaatgga 600 

acttttgaaa atggaaatac ttcaggatgg gttggaacag gctcatctgt tgttaaggca 660 

gtgtatggag tggctcatag cggaggttat agtttattga cgacagggag aacagctaat 720 

tggaatggtc ctagctatga tttgactggc aaaatagtac caggtcaaca atacaatgtt 780 

gatttttggg tgaaatttgt taatggcaat gatacagaac aaataaaggc tactgttaaa 840 

gcgacttcta acaaagacaa ttatatacaa gttaatgatt ttgtaaatgt aaataaaggc 900 

gaatggacag aaataaaagg cagttttact ttacctgtga cagattacag cggtgtcagc 960 

atctatgtag aatctcaaaa tcctacttta gagttttaca ttgatgattt ttctgtaata 1020 

ggtgaaattt caaataatca gattacaata caaaatgata ttccggattt atattcagta 1080 

ttcaaagatt atttccccat cggtgttgca gttgattcga gtagattaaa tgatgctgat 1140 

ccacatgctc aattgactgc taaacatttt aatatgcttg ttgcagaaaa tgccatgaaa 1200 

ccggaaagct tgcagcctac agagggaaac tttacctttg ataatgctga taagattgtt 1260 

gattatgaaa tagcacataa tatgaagatg agaggtcata cattgctttg gcataatcag 1320 

gttccggatt ggtttttcca ggacccatct gatccgtcta aaccagcttc aagggatctg 1380 

ctgcttcaaa gattaagaac gcacataaca actgtgttag atcattttaa aacaaaatac 1440 

ggttctcaaa atccaataat cggatgggat gttgtaaatg aggttcttga tgataatggc 1500 

aatttaagaa attctaagtg gttacaaatt ataggacctg attatataga aaaagctttt 1560 

gaatatgcgc atgaggcaga tccatctatg aaattgttta ttaatgatta caacatcgaa 1620 

aataatggcg ttaaaacaca ggcaatgtat gatttagtga aaaagttaaa aagtgaaggt 1680 

gtgcctataa acggaatagg catgcaaatg cacataagca taaattcaaa tatagacaat 1740 

ataaaagctt ctatagaaaa acttgcatca ttaggtgtgg aaatacaggt aactgaatta 1800 

gatatgaaca tgaatggtaa tgtatctaac gacgcattgc ttaagcaagc gagattgtat 1860 

aaacaattat ttgacttatt taaagcagaa aaacaatata taactgctgt agttttttgg 1920 

ggagtttcag atgatgtaag ttggcttagt aagccaaatg ctccactact ttttgattca 1980 

aagttacagg caaagccagc atactgggca attgtagatc aaggcaaagc catacctgac 2040 

attcaatctg caaaagcttt agaaggatca ccgacgattg gtgcaaatgt tgatagttct 2100 

tggaaacttg taaaaccatt gtatgctaat acttatgtga aaggaactat tggagcaact 2160 

gctgctgtta aatctatgtg ggatactaaa aacttatatt tgttagtaca agtttcagac 2220 

aatactccat ctaataatga tggtatcgag atttttgtgg ataagaatga caacaaatct 2280 

actacctatg aaagtgacga tgaacattat atagttaaga gggatggtac agggagttca 2340 

aatattacaa agtatgtaat gtctaatgct gatggatatg tagcacagat agctattcca 2400 

attgaagaca ttagtcctgt gctgaatgat aaaattggat ttgatatcag aataaatgat 2460 

gaccaaggca gtggcaatat aaatgcgata acagtttgga atgattatac aaacagtcaa 2520 

gatactaata cagcatattt tggagattta gtattatcaa aacctgcaca gattgcaaca 2580 

gctatatatg gcactcctgt tattgacggt aaagtagatg gtatttggaa taatgctgaa 2640 

gctatttcga caaatacatg ggtcttgggt tcaaatggtg ctactgcaac agcaaaaatg 2700 

atgtgggacg ataaatatct ttatatattg gcagatgtaa cagataacaa tttaaataaa 2760 

tccagtgtga atccttatga acaggattct gtggaagttt ttgtagatca gaataatgat 2820 

aagacaactt attatgaaaa tgatgatggg cagtttagag ttaactatga taatgaacaa 2880 

agttttggag gaagcactaa ttcaaatgga tttaagtcgg caacaagtct tacacaaaat 2940 

ggatatattg tagaagaagc tattccttgg acgagtatta ctccgtcaaa tggtactatc 3000 

atagggtttg acttgcaagt taacgatgca gatgaaaatg gtaagaggac aggtattgtc 3060 

acatggtgtg atccaagcgg aaattcttgg caagatactt ctggatttgg aaacttgatg 3120 

cttacaggta agccatctgg tgtccttaaa aaggttgtgg catttaatga cataaaagac 3180 

aattgggcga aagacgtaat agaagtatta gcgtcaaggc acatagtaga agggatgaca 3240 

gacacccagt atgaaccaaa caagacagtg acgagagcag aatttacagc aatgatactg 3300 

aggctattaa acataaaaga agaagcatac agcggagaat ttagcgatgt aaaaagtgga 3360 

gactggtatg caaacgcgat agaagcagca tacaaaacag gaataatcga aggtgacgga 3420 
aagaacgcaa ggccaaatga cagcataaca agagaagaga tgacagcaat agccatgagg 3480

gcatacgaga tgctgacaca gtacgaagaa gagaatatag gtgcgacaac atttagcgac 3540 

gacaaatcca taagcgattg ggcaagaaat gtagtggcaa atgcagcgaa attaggaata 3600 

gtaaatggtg agccaaataa cgtatttgca cctaaaggaa atgccacaag agcagaagca 3660 

gcagctatca tatacggctt attagaaaaa acaaataagc tttaa 3705 

<210> 306 
<211> 1234 
<212> PRT 
<213> Bacteria 

<220> 

<221> SIGNAL 
<222> (1)...(32)

<400> 306 

Met Lys Ser Ile Val Asn Arg Val Val Ser Ile Val Thr Ala Leu Ile

1 5 10 15

Met Ile Phe Gly Thr Ser Leu Leu Ser Gln His Ile Arg Ala Phe Ala

20 25 30

Asp Asp Thr Asn Thr Asn Leu Val Ser Asn Gly Asp Phe Glu Thr Gly

35 40 45

Thr Ile Asp Gly Trp Ile Lys Gln Gly Asn Pro Thr Leu Glu Val Thr

50 55 60

Thr Glu Gln Ala Ile Gly Gln Tyr Ser Met Lys Val Thr Gly Arg Thr
65 70 75 80

Gln Thr Tyr Glu Gly Pro Ala Tyr Ser Phe Leu Gly Lys Met Gln Lys

85 90 95

Gly Glu Ser Tyr Asn Val Ser Leu Lys Val Arg Leu Val Ser Gly Gln

100 105 110

Asn Ser Ser Asn Pro Phe Ile Thr Val Thr Met Phe Arg Glu Asp Asp

115 120 125

Asn Gly Lys His Tyr Asp Thr Ile Val Trp Gln Lys Gln Val Ser Glu

130 135 140

Asp Ser Trp Thr Thr Val Ser Gly Thr Tyr Thr Leu Asp Tyr Thr Gly
145 150 155 160

Thr Leu Lys Thr Leu Tyr Met Tyr Val Glu Ser Pro Asp Pro Thr Leu

165 170 175

Glu Tyr Tyr Ile Asp Asp Val Val Val Thr Pro Gln Asn Pro Ile Gln

180 185 190

Val Gly Asn Val Ile Thr Asn Gly Thr Phe Glu Asn Gly Asn Thr Ser

195 200 205

Gly Trp Val Gly Thr Gly Ser Ser Val Val Lys Ala Val Tyr Gly Val

210 215 220

Ala His Ser Gly Gly Tyr Ser Leu Leu Thr Thr Gly Arg Thr Ala Asn
225 230 235 240

Trp Asn Gly Pro Ser Tyr Asp Leu Thr Gly Lys Ile Val Pro Gly Gln

245 250 255

Gln Tyr Asn Val Asp Phe Trp Val Lys Phe Val Asn Gly Asn Asp Thr

260 265 270

Glu Gln Ile Lys Ala Thr Val Lys Ala Thr Ser Asn Lys Asp Asn Tyr

275 280 285

Ile Gln Val Asn Asp Phe Val Asn Val Asn Lys Gly Glu Trp Thr Glu

290 295 300

Ile Lys Gly Ser Phe Thr Leu Pro Val Thr Asp Tyr Ser Gly Val Ser
305 310 315 320

Ile Tyr Val Glu Ser Gln Asn Pro Thr Leu Glu Phe Tyr Ile Asp Asp

325 330 335

Phe Ser Val Ile Gly Glu Ile Ser Asn Asn Gln Ile Thr Ile Gln Asn

340 345 350

Asp Ile Pro Asp Leu Tyr Ser Val Phe Lys Asp Tyr Phe Pro Ile Gly

355 360 365

Val Ala Val Asp Ser Ser Arg Leu Asn Asp Ala Asp Pro His Ala Gln

370 375 380

Leu Thr Ala Lys His Phe Asn Met Leu Val Ala Glu Asn Ala Met Lys
385 390 395 400

Pro Glu Ser Leu Gln Pro Thr Glu Gly Asn Phe Thr Phe Asp Asn Ala

405 410 415

Asp Lys Ile Val Asp Tyr Glu Ile Ala His Asn Met Lys Met Arg Gly
420 425 430

His Thr Leu Leu Trp His Asn Gln Val Pro Asp Trp Phe Phe Gln Asp

435 440 445

Pro Ser Asp Pro Ser Lys Pro Ala Ser Arg Asp Leu Leu Leu Gln Arg

450 455 460

Leu Arg Thr His Ile Thr Thr Val Leu Asp His Phe Lys Thr Lys Tyr
465 470 475 480

Gly Ser Gln Asn Pro Ile Ile Gly Trp Asp Val Val Asn Glu Val Leu

485 490 495

Asp Asp Asn Gly Asn Leu Arg Asn Ser Lys Trp Leu Gln Ile Ile Gly

500 505 510

Pro Asp Tyr Ile Glu Lys Ala Phe Glu Tyr Ala His Glu Ala Asp Pro

515 520 525

Ser Met Lys Leu Phe Ile Asn Asp Tyr Asn Ile Glu Asn Asn Gly Val

530 535 540

Lys Thr Gln Ala Met Tyr Asp Leu Val Lys Lys Leu Lys Ser Glu Gly
545 550 555 560

Val Pro Ile Asn Gly Ile Gly Met Gln Met His Ile Ser Ile Asn Ser

565 570 575

Asn Ile Asp Asn Ile Lys Ala Ser Ile Glu Lys Leu Ala Ser Leu Gly

580 585 590

Val Glu Ile Gln Val Thr Glu Leu Asp Met Asn Met Asn Gly Asn Val

595 600 605

Ser Asn Asp Ala Leu Leu Lys Gln Ala Arg Leu Tyr Lys Gln Leu Phe

610 615 620

Asp Leu Phe Lys Ala Glu Lys Gln Tyr Ile Thr Ala Val Val Phe Trp
625 630 635 640

Gly Val Ser Asp Asp Val Ser Trp Leu Ser Lys Pro Asn Ala Pro Leu

645 650 655

Leu Phe Asp Ser Lys Leu Gln Ala Lys Pro Ala Tyr Trp Ala Ile Val

660 665 670

Asp Gln Gly Lys Ala Ile Pro Asp Ile Gln Ser Ala Lys Ala Leu Glu

675 680 685

Gly Ser Pro Thr Ile Gly Ala Asn Val Asp Ser Ser Trp Lys Leu Val

690 695 700

Lys Pro Leu Tyr Ala Asn Thr Tyr Val Lys Gly Thr Ile Gly Ala Thr
705 710 715 720

Ala Ala Val Lys Ser Met Trp Asp Thr Lys Asn Leu Tyr Leu Leu Val

725 730 735

Gln Val Ser Asp Asn Thr Pro Ser Asn Asn Asp Gly Ile Glu Ile Phe

740 745 750

Val Asp Lys Asn Asp Asn Lys Ser Thr Thr Tyr Glu Ser Asp Asp Glu

755 760 765

His Tyr Ile Val Lys Arg Asp Gly Thr Gly Ser Ser Asn Ile Thr Lys

770 775 780

Tyr Val Met Ser Asn Ala Asp Gly Tyr Val Ala Gln Ile Ala Ile Pro
785 790 795 800

Ile Glu Asp Ile Ser Pro Val Leu Asn Asp Lys Ile Gly Phe Asp Ile

805 810 815

Arg Ile Asn Asp Asp Gln Gly Ser Gly Asn Ile Asn Ala Ile Thr Val

820 825 830

Trp Asn Asp Tyr Thr Asn Ser Gln Asp Thr Asn Thr Ala Tyr Phe Gly

835 840 845

Asp Leu Val Leu Ser Lys Pro Ala Gln Ile Ala Thr Ala Ile Tyr Gly

850 855 860

Thr Pro Val Ile Asp Gly Lys Val Asp Gly Ile Trp Asn Asn Ala Glu
865 870 875 880

Ala Ile Ser Thr Asn Thr Trp Val Leu Gly Ser Asn Gly Ala Thr Ala

885 890 895

Thr Ala Lys Met Met Trp Asp Asp Lys Tyr Leu Tyr Ile Leu Ala Asp

900 905 910

Val Thr Asp Asn Asn Leu Asn Lys Ser Ser Val Asn Pro Tyr Glu Gln

915 920 925

Asp Ser Val Glu Val Phe Val Asp Gln Asn Asn Asp Lys Thr Thr Tyr

930 935 940

Tyr Glu Asn Asp Asp Gly Gln Phe Arg Val Asn Tyr Asp Asn Glu Gln
945 950 955 960

Ser Phe Gly Gly Ser Thr Asn Ser Asn Gly Phe Lys Ser Ala Thr Ser
965 970 975 
Leu Thr Gln Asn Gly Tyr Ile Val Glu Glu Ala Ile Pro Trp Thr Ser

980 985 990 

Ile Thr Pro Ser Asn Gly Thr Ile Ile Gly Phe Asp Leu Gln Val Asn

995 1000 1005

Asp Ala Asp Glu Asn Gly Lys Arg Thr Gly Ile Val Thr Trp Cys Asp

1010 1015 1020

Pro Ser Gly Asn Ser Trp Gln Asp Thr Ser Gly Phe Gly Asn Leu Met
1025 1030 1035 1040

Leu Thr Gly Lys Pro Ser Gly Val Leu Lys Lys Val Val Ala Phe Asn

1045 1050 1055

Asp Ile Lys Asp Asn Trp Ala Lys Asp Val Ile Glu Val Leu Ala Ser

1060 1065 1070

Arg His Ile Val Glu Gly Met Thr Asp Thr Gln Tyr Glu Pro Asn Lys

1075 1080 1085

Thr Val Thr Arg Ala Glu Phe Thr Ala Met Ile Leu Arg Leu Leu Asn

1090 1095 1100

Ile Lys Glu Glu Ala Tyr Ser Gly Glu Phe Ser Asp Val Lys Ser Gly
1105 1110 1115 1120

Asp Trp Tyr Ala Asn Ala Ile Glu Ala Ala Tyr Lys Thr Gly Ile Ile

1125 1130 1135

Glu Gly Asp Gly Lys Asn Ala Arg Pro Asn Asp Ser Ile Thr Arg Glu

1140 1145 1150

Glu Met Thr Ala Ile Ala Met Arg Ala Tyr Glu Met Leu Thr Gln Tyr

1155 1160 1165

Glu Glu Glu Asn Ile Gly Ala Thr Thr Phe Ser Asp Asp Lys Ser Ile

1170 1175 1180

Ser Asp Trp Ala Arg Asn Val Val Ala Asn Ala Ala Lys Leu Gly Ile
1185 1190 1195 1200

Val Asn Gly Glu Pro Asn Asn Val Phe Ala Pro Lys Gly Asn Ala Thr

1205 1210 1215

Arg Ala Glu Ala Ala Ala Ile Ile Tyr Gly Leu Leu Glu Lys Thr Asn
1220 1225 1230 

Lys Leu 



<210> 307 
<211> 3729 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 307 

atgaaatccg ccatcggcga cgctgcgttt atggcggcga tggaacgcaa ctccgacatc 60 

gtgacgatgc agtgctacgc acctatcttt gtcaatgtca atcccggcgg gcggcagtgg 120 

cgccccaatt tgatgggcta cgatgcgtta agcgctttcg gctcgccctc gtattacgcc 180 

atcaaaatgt tcagcaacaa tttgggcgat acgattttga agcccagtct cagcggtgcg 240 

cgcctgccag tttccgttac acaagagcag aaaagcggca cgattttcat taaattggtg 300 

aacccgcaaa cgacgccaca gagcgtaaaa attgatctca aaggcgtgcg ctccgtcgaa 360 

ttcagcggca ccgccactgt tttagctgcc gactccggcg cgcttaactc cattgatgcg 420 

cctaccaaag tcgttcctgt cacgcgcaga attactggaa tcagcccctc gtttgcgcaa 480 

acgctggagc cgtattcgat tactgttttg caaatcaagg ccactgctct gccaacggcg 540 

acagcaaacg ccgttgcgcc gccaaccttc accacggagc caaaagtgaa caccaccacg 600 

cccgttacta ttcccgttgc gacttcgcag ccgcagccag tttcgcaacc gtcgccagat 660 

gcaaacgcta tcgcgccgct gaaaaacgct ttcaaaggca agttcctcat tggcaccgtg 720 

ttgagcgggc cagacctgcg cggccagcaa acgcgcagcg tgggtatcgc caccacgcat 780 

ttcgacgcct ttacggcaga aaacgaaatg aagccggatg cgatgcagcc tcgcgaagga 840 

caattcaact ttgctgccgg cgaccgctta gtggaactgg ccgaaaaaag cggcgccacg 900 

cccatcggcc acacgctaat ctggcactcg caaacaccac gctggttctt tgaagggccg 960 

gacggacaac cggcgaatcg tgaattggcc ttggcgcgga tgcgcaagca tattgcaacc 1020 

gtcgtcggcc actacaaagg gcgcgtcaag cagtgggacg tggtgaacga agccataaac 1080 

gatggccccg gcgtgctgcg ccaaagcccg tggctgcgcg ctatcggcga agactacatc 1140

gctgaagcgt ttcgagccgc gcacgccgcc gaccctgacg caattctcat ctataacgat 1200 

tacaacatcg aaatgggcta caaacggccc aaagcaatcc aactcttaaa atctctggtt 1260 

gatcagaaag tgccgattca tgcggttggt attcagggcc actggcgtat ggacaccaac 1320 

ctcaccgaag tggaacaagc tattaaagaa ttctcggcgc tgggcctgaa ggtgatgatc 1380 

actgaactcg acatcggtgt tttgccgacg cgttatcagg gcgctgatat ttcgcaggtg 1440 

caaaacatga cgcctgaaca gcgcgccgcc gtgaacccat ataccaatgg tttgcccgac 1500 
gacgtagcgc aaaaacacgc cgataaatat cgccaggcct tcgatatttt cctgcgctac 1560

aaagatgtca tcgaacgcgt cacgttctgg ggtgtggacg atgctcattc gtggctgaac 1620 

ggtttcccga ttcgcgggcg caccgattac ccattgcttt tcgaccggca gggcaaacct 1680 

aaacccgcct ttttcgccgt gcaaaacctg gctttaggcg tgaccgccgc gccgcaatcg 1740 

aatgcttcgt ctgcacccag agctgtcgct caagctgcgc cggcaaccag caatattcgt 1800 

ggccaggaat ttccacgggt cgaaagcgat ttgcgtgtga cgtttcgcat caaagccccc 1860 

gaagcgcaaa aagtgcagtt cgatttaggc aagccttacg acgcgacccg cgatgctgaa 1920 

ggcaactgga cggcaaccac cgagccgcag gtgcccggct ttcactatta caatttggtg 1980 

attgacggcg tgcgcgtgaa cgacccggcc agcgaaacct tttacggcgc gggccgccag 2040 

atgagcggca tcgaaatccc cgaccctgac agcgcctttt attcgccgca aaacgtgccg 2100 

cacggcgaag tgcgcgagcg ctggtatttt tccaacacca cgcaggcgtg gcgtcgcatt 2160 

ttcatttata cgccgccggg ttatgacacg aatcaagtgg agcgttttcc cgttttgtat 2220 

ttacagcatg gcggcggcga agatgagcgc ggctggcctc aacagggtcg catgagcttt 2280 

atcatggata atctcatcgc cacacgtaaa gccaaaccga tgcttgtggt gatggaacaa 2340 

ggctatgcgc gcaagcccaa cgagccgcag gtgccgttgc gcccgcccgg cggtagcgcc 2400 

ggagccatgc cccccgattt caatcgcatg ttcggcacac tgggtgaagt gttcaccaaa 2460 

gacctgattc cctttattga tgctaattac cgcaccaaaa ccgaccgcga aaaccgcgcc 2520 

atggccggac tttcgatggg cgggatgcaa agtttcctta ttggcttgtc gaacaccgat 2580 

ttattcgcgc acatcggagg cttcagcggc gcgggcggcg gtttcggtgg cggcaccttc 2640 

gacgcaaaaa cggcgcacgg cggtgtaatg gccgacgccg acgcgttcaa caaaaaagtt 2700 

cgcacgctgt ttctcagcat cggcacagcg gaaaacgaac gctttcagag cagcgtgcgc 2760 

ggttaccgcg atgcgctgac caaagcgggc atcaaaacca cattctacga atcgcccggc 2820 

acctcacatg aatggctgac atggcggcgt agcctgaaag aattcgcacc gctcttgttt 2880 

caagaagtcg aagtgcaaat tgagcgcggc ccgaatgcgc gcccaattgc gccgcaaccg 2940 

attaatctcg gccccgacga taaacccgca tttcccccgg tgcccgccgg tttcgatgtc 3000 

cgccgcaacg atattccgca tggcgaaatc aaactcgtgg aatatccatc cgctacagtg 3060 

ggcacaaatc gtaagatgca ggtttacacg ccgcccggtt acaatccgca agaaaagtat 3120 

gcggtgcttt atctgttgca cggaatcggc ggcgacgagt gggaatggaa aaatggcggc 3180 

acgccggaag tgattctcga taatctttac gccgcgaaaa aactccagcc gatgattgtg 3240 

gtcatgccca atggccgcgc gcaaaaagat gaccgtccaa tcggcaatgt gttcgcatca 3300 

gcgcctgctt ttgaaacctt cgagaaagat ttactcaacg acgtaattcc gtttatcgaa 3360 

aagaattatc cagtcaaaac cggcgccgaa aatcgcgcgc ttgcgggcct ttcgatgggc 3420 

ggtgggcaat cgctgaactt tggtctgggc aatctcgaca cctttgcgtg ggtcggcggc 3480 

ttttcttcgg cgcccaacac gcgcactggc gcaaggctat tagccaatcc cgacgatgcc 3540 

aaaaagaagc tgaaattgtt atgggtttcg tgtggcgata aagacggctt gtttttcatc 3600 

agccagcgca cgcatcgcta tctcgccgaa aacaatgtgc cacacgtctg gcatgtgcag 3660 

cccggcggcc acgatttccg agtgtggaag caagaccttt ataacttttc gcaactgctg 3720 

ttccgctaa 3729 

<210> 308 
<211> 1242 
<212> PRT 
<213> unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 308 

Met Lys Ser Ala Ile Gly Asp Ala Ala Phe Met Ala Ala Met Glu Arg

1 5 10 15

Asn Ser Asp Ile Val Thr Met Gln Cys Tyr Ala Pro Ile Phe Val Asn

20 25 30

Val Asn Pro Gly Gly Arg Gln Trp Arg Pro Asn Leu Met Gly Tyr Asp

35 40 45 

Ala Leu Ser Ala Phe Gly Ser Pro Ser Tyr Tyr Ala Ile Lys Met Phe

50 55 60

Ser Asn Asn Leu Gly Asp Thr Ile Leu Lys Pro Ser Leu Ser Gly Ala
65 70 75 80

Arg Leu Pro Val Ser Val Thr Gln Glu Gln Lys Ser Gly Thr Ile Phe

85 90 95

Ile Lys Leu Val Asn Pro Gln Thr Thr Pro Gln Ser Val Lys Ile Asp

100 105 110

Leu Lys Gly Val Arg Ser Val Glu Phe Ser Gly Thr Ala Thr Val Leu

115 120 125

Ala Ala Asp Ser Gly Ala Leu Asn Ser Ile Asp Ala Pro Thr Lys Val

130 135 140

Val Pro Val Thr Arg Arg Ile Thr Gly Ile Ser Pro Ser Phe Ala Gln
145 150 155 160 

Thr Leu Glu Pro Tyr Ser Ile Thr Val Leu Gln Ile Lys Ala Thr Ala

165 170 175

Leu Pro Thr Ala Thr Ala Asn Ala Val Ala Pro Pro Thr Phe Thr Thr

180 185 190

Glu Pro Lys Val Asn Thr Thr Thr Pro Val Thr Ile Pro Val Ala Thr

195 200 205

Ser Gln Pro Gln Pro Val Ser Gln Pro Ser Pro Asp Ala Asn Ala Ile

210 215 220

Ala Pro Leu Lys Asn Ala Phe Lys Gly Lys Phe Leu Ile Gly Thr Val
225 230 235 240

Leu Ser Gly Pro Asp Leu Arg Gly Gln Gln Thr Arg Ser Val Gly Ile

245 250 255

Ala Thr Thr His Phe Asp Ala Phe Thr Ala Glu Asn Glu Met Lys Pro

260 265 270

Asp Ala Met Gln Pro Arg Glu Gly Gln Phe Asn Phe Ala Ala Gly Asp

275 280 285

Arg Leu Val Glu Leu Ala Glu Lys Ser Gly Ala Thr Pro Ile Gly His

290 295 300

Thr Leu Ile Trp His Ser Gln Thr Pro Arg Trp Phe Phe Glu Gly Pro
305 310 315 320

Asp Gly Gln Pro Ala Asn Arg Glu Leu Ala Leu Ala Arg Met Arg Lys

325 330 335

His Ile Ala Thr Val Val Gly His Tyr Lys Gly Arg Val Lys Gln Trp

340 345 350

Asp Val Val Asn Glu Ala Ile Asn Asp Gly Pro Gly Val Leu Arg Gln

355 360 365

Ser Pro Trp Leu Arg Ala Ile Gly Glu Asp Tyr Ile Ala Glu Ala Phe

370 375 380

Arg Ala Ala His Ala Ala Asp Pro Asp Ala Ile Leu Ile Tyr Asn Asp
385 390 395 400

Tyr Asn Ile Glu Met Gly Tyr Lys Arg Pro Lys Ala Ile Gln Leu Leu

405 410 415

Lys Ser Leu Val Asp Gln Lys Val Pro Ile His Ala Val Gly Ile Gln

420 425 430

Gly His Trp Arg Met Asp Thr Asn Leu Thr Glu Val Glu Gln Ala Ile

435 440 445

Lys Glu Phe Ser Ala Leu Gly Leu Lys Val Met Ile Thr Glu Leu Asp

450 455 460

Ile Gly Val Leu Pro Thr Arg Tyr Gln Gly Ala Asp Ile Ser Gln Val
465 470 475 480

Gln Asn Met Thr Pro Glu Gln Arg Ala Ala Val Asn Pro Tyr Thr Asn

485 490 495

Gly Leu Pro Asp Asp Val Ala Gln Lys His Ala Asp Lys Tyr Arg Gln

500 505 510

Ala Phe Asp Ile Phe Leu Arg Tyr Lys Asp Val Ile Glu Arg Val Thr

515 520 525

Phe Trp Gly Val Asp Asp Ala His Ser Trp Leu Asn Gly Phe Pro Ile

530 535 540

Arg Gly Arg Thr Asp Tyr Pro Leu Leu Phe Asp Arg Gln Gly Lys Pro
545 550 555 560

Lys Pro Ala Phe Phe Ala Val Gln Asn Leu Ala Leu Gly Val Thr Ala

565 570 575

Ala Pro Gln Ser Asn Ala Ser Ser Ala Pro Arg Ala Val Ala Gln Ala

580 585 590

Ala Pro Ala Thr Ser Asn Ile Arg Gly Gln Glu Phe Pro Arg Val Glu

595 600 605

Ser Asp Leu Arg Val Thr Phe Arg Ile Lys Ala Pro Glu Ala Gln Lys

610 615 620

Val Gln Phe Asp Leu Gly Lys Pro Tyr Asp Ala Thr Arg Asp Ala Glu
625 630 635 640

Gly Asn Trp Thr Ala Thr Thr Glu Pro Gln Val Pro Gly Phe His Tyr

645 650 655

Tyr Asn Leu Val Ile Asp Gly Val Arg Val Asn Asp Pro Ala Ser Glu

660 665 670

Thr Phe Tyr Gly Ala Gly Arg Gln Met Ser Gly Ile Glu Ile Pro Asp

675 680 685

Pro Asp Ser Ala Phe Tyr Ser Pro Gln Asn Val Pro His Gly Glu Val

690 695 700

Arg Glu Arg Trp Tyr Phe Ser Asn Thr Thr Gln Ala Trp Arg Arg Ile
705 710 715 720

Phe Ile Tyr Thr Pro Pro Gly Tyr Asp Thr Asn Gln Val Glu Arg Phe

725 730 735 

Pro Val Leu Tyr Leu Gln His Gly Gly Gly Glu Asp Glu Arg Gly Trp

740 745 750

Pro Gln Gln Gly Arg Met Ser Phe Ile Met Asp Asn Leu Ile Ala Thr

755 760 765

Arg Lys Ala Lys Pro Met Leu Val Val Met Glu Gln Gly Tyr Ala Arg

770 775 780

Lys Pro Asn Glu Pro Gln Val Pro Leu Arg Pro Pro Gly Gly Ser Ala
785 790 795 800

Gly Ala Met Pro Pro Asp Phe Asn Arg Met Phe Gly Thr Leu Gly Glu

805 810 815

Val Phe Thr Lys Asp Leu Ile Pro Phe Ile Asp Ala Asn Tyr Arg Thr

820 825 830

Lys Thr Asp Arg Glu Asn Arg Ala Met Ala Gly Leu Ser Met Gly Gly

835 840 845

Met Gln Ser Phe Leu Ile Gly Leu Ser Asn Thr Asp Leu Phe Ala His

850 855 860

Ile Gly Gly Phe Ser Gly Ala Gly Gly Gly Phe Gly Gly Gly Thr Phe
865 870 875 880

Asp Ala Lys Thr Ala His Gly Gly Val Met Ala Asp Ala Asp Ala Phe

885 890 895

Asn Lys Lys Val Arg Thr Leu Phe Leu Ser Ile Gly Thr Ala Glu Asn

900 905 910

Glu Arg Phe Gln Ser Ser Val Arg Gly Tyr Arg Asp Ala Leu Thr Lys

915 920 925

Ala Gly Ile Lys Thr Thr Phe Tyr Glu Ser Pro Gly Thr Ser His Glu

930 935 940

Trp Leu Thr Trp Arg Arg Ser Leu Lys Glu Phe Ala Pro Leu Leu Phe
945 950 955 960

Gln Glu Val Glu Val Gln Ile Glu Arg Gly Pro Asn Ala Arg Pro Ile

965 970 975

Ala Pro Gln Pro Ile Asn Leu Gly Pro Asp Asp Lys Pro Ala Phe Pro

980 985 990

Pro Val Pro Ala Gly Phe Asp Val Arg Arg Asn Asp Ile Pro His Gly

995 1000 1005

Glu Ile Lys Leu Val Glu Tyr Pro Ser Ala Thr Val Gly Thr Asn Arg

1010 1015 1020

Lys Met Gln Val Tyr Thr Pro Pro Gly Tyr Asn Pro Gln Glu Lys Tyr
1025 1030 1035 1040

Ala Val Leu Tyr Leu Leu His Gly Ile Gly Gly Asp Glu Trp Glu Trp

1045 1050 1055

Lys Asn Gly Gly Thr Pro Glu Val Ile Leu Asp Asn Leu Tyr Ala Ala

1060 1065 1070

Lys Lys Leu Gln Pro Met Ile Val Val Met Pro Asn Gly Arg Ala Gln

1075 1080 1085

Lys Asp Asp Arg Pro Ile Gly Asn Val Phe Ala Ser Ala Pro Ala Phe

1090 1095 1100

Glu Thr Phe Glu Lys Asp Leu Leu Asn Asp Val Ile Pro Phe Ile Glu
1105 1110 1115 1120

Lys Asn Tyr Pro Val Lys Thr Gly Ala Glu Asn Arg Ala Leu Ala Gly

1125 1130 1135

Leu Ser Met Gly Gly Gly Gln Ser Leu Asn Phe Gly Leu Gly Asn Leu

1140 1145 1150

Asp Thr Phe Ala Trp Val Gly Gly Phe Ser Ser Ala Pro Asn Thr Arg

1155 1160 1165

Thr Gly Ala Arg Leu Leu Ala Asn Pro Asp Asp Ala Lys Lys Lys Leu

1170 1175 1180

Lys Leu Leu Trp Val Ser Cys Gly Asp Lys Asp Gly Leu Phe Phe Ile
1185 1190 1195 1200

Ser Gln Arg Thr His Arg Tyr Leu Ala Glu Asn Asn Val Pro His Val

1205 1210 1215

Trp His Val Gln Pro Gly Gly His Asp Phe Arg Val Trp Lys Gln Asp

1220 1225 1230

Leu Tyr Asn Phe Ser Gln Leu Leu Phe Arg
1235 1240 

<210> 309 
<211> 1830 
<212> DNA
<213> unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 309 

ttgaaaaaac tcacaatcgc cctatccctt gcaatcactt ttgccgcgcc agtttttgcg 60 

acagatgctt gtttgcaaaa tactcaatta aatgctaccg cccaaggagc acaaacctgg 120 

actggcaaaa aaggagctac aactttaggc ggttcaggtg acgatgctta tggagttgaa 180 

acttggacag aagctggtgg agacgctact aaatttacat ggtttggacc aaatcagggt 240 

ggtggtttcg cttatagagc ggaatggaca aattccacag attacttagg tcgctttggt 300 

tatttttggg gtattgacgg gaaaaaatgg gacaaattag gagacctttg cgttgattat 360 

aactataaaa gatctgccaa tggcactgga ggttcatatt cttatatagg cgtttatgga 420 

tggacaaacg ctggcggtgg tactgaagct gaatattata tagttgaaga ttggtttgga 480 

gaaaatcaac agactgcaaa taatttaggc aatggttgcc aagagcacgg tgaaattaca 540 

gtggacgaga aaagctataa agttgtcact tgcataagac cagcaggctc tggctgcgta 600 

acttgcaacg gacaacaatt cgggcaagtc tttagcatac gccaaggcat gagaagcgaa 660 

caacctaaaa catgcggaac aatctccatc aaaaagcact ttgaagaatg ggtaaaaatg 720 

acaaccgaaa aaagcggaca atccccagct aaatatattt acgataaaac ttatgaatcc 780 

aaatttttag cggaggcaca aggcggcact ggttggcttg aaaccacttt ccttaaattt 840 

tctagaaacg gtgattgcgg cttcgatatt cctgatggtc atttcaccct tcaattagct 900 

acctctccgt ctgagggcgg tactgtaagc agaaacccaa aggaatcttc ttatgcctcc 960 

ggttcaactg taactctcac agccactccg gcagcaggtt ggaaatttgc cagttggagt 1020 

ggtgatgcat gtcaaaccac aagcccattg acagtcacta tggacaaaaa caaggtaatc 1080 

acagcgaaat ttacccccgt tgtagatctc aataaaaacc ttgttacaaa tggtactttc 1140 

acgaataaag agagttggac ttttaatact ggttccagtt atggcaactc cgaaggaact 1200 

tttgatgtgt caaacagcga aggcagaata aatgtgacga aaattggctc caacccgtgg 1260 

gaaccacagc tcgttcaaaa tggcatcacc cttgttgaag gaatgaatta caaaataact 1320 

tttgaagctt ctgcctctgc tgctcgaaaa ataggcttgg ttatacaaat ggcaggcggc 1380 

gattatacca cttattttga aaaagatata gacttgactg cctcaaatca acaattttcg 1440 

tatgaattca aaatgaatgc accaagcgat gaaagcggtc gtattgggtt taatcttggg 1500 

caaacgactg gaaacgtcac tctaagcaaa atcacccttc attatttaga agaagattca 1560 

cacgaaccgt ccgacaatcc ctgcgaagac ccaagtccaa ttttgaaaaa acgcatccct 1620 

gcaactcatt tctcccttca aacgcttagc gacaaagcct tgcgcataga agtgaacgct 1680 

ccaaccattg tggacatttt tgacctgaga gggaataaag ttaagagttt gaatgtctcc 1740 

ggctcgcaaa cggttaaatt atccctgcca agcggagtgt attttgccaa agcacgtgga 1800 

atgaaaagcg tcagatttgt gttgaggtaa 1830

<210> 310 
<211> 609 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(20)

<400> 310 

Met Lys Lys Leu Thr Ile Ala Leu Ser Leu Ala Ile Thr Phe Ala Ala
1 5 10 15

Pro Val Phe Ala Thr Asp Ala Cys Leu Gln Asn Thr Gln Leu Asn Ala

20 25 30

Thr Ala Gln Gly Ala Gln Thr Trp Thr Gly Lys Lys Gly Ala Thr Thr

35 40 45

Leu Gly Gly Ser Gly Asp Asp Ala Tyr Gly Val Glu Thr Trp Thr Glu

50 55 60

Ala Gly Gly Asp Ala Thr Lys Phe Thr Trp Phe Gly Pro Asn Gln Gly
65 70 75 80

Gly Gly Phe Ala Tyr Arg Ala Glu Trp Thr Asn Ser Thr Asp Tyr Leu

85 90 95

Gly Arg Phe Gly Tyr Phe Trp Gly Ile Asp Gly Lys Lys Trp Asp Lys

100 105 110

Leu Gly Asp Leu Cys Val Asp Tyr Asn Tyr Lys Arg Ser Ala Asn Gly

115 120 125

Thr Gly Gly Ser Tyr Ser Tyr Ile Gly Val Tyr Gly Trp Thr Asn Ala
130 135 140 
Gly Gly Gly Thr Glu Ala Glu Tyr Tyr Ile Val Glu Asp Trp Phe Gly
145 150 155 160 

Glu Asn Gln Gln Thr Ala Asn Asn Leu Gly Asn Gly Cys Gln Glu His

165 170 175

Gly Glu Ile Thr Val Asp Glu Lys Ser Tyr Lys Val Val Thr Cys Ile

180 185 190

Arg Pro Ala Gly Ser Gly Cys Val Thr Cys Asn Gly Gln Gln Phe Gly

195 200 205

Gln Val Phe Ser Ile Arg Gln Gly Met Arg Ser Glu Gln Pro Lys Thr

210 215 220

Cys Gly Thr Ile Ser Ile Lys Lys His Phe Glu Glu Trp Val Lys Met
225 230 235 240

Thr Thr Glu Lys Ser Gly Gln Ser Pro Ala Lys Tyr Ile Tyr Asp Lys

245 250 255

Thr Tyr Glu Ser Lys Phe Leu Ala Glu Ala Gln Gly Gly Thr Gly Trp

260 265 270

Leu Glu Thr Thr Phe Leu Lys Phe Ser Arg Asn Gly Asp Cys Gly Phe

275 280 285

Asp Ile Pro Asp Gly His Phe Thr Leu Gln Leu Ala Thr Ser Pro Ser

290 295 300

Glu Gly Gly Thr Val Ser Arg Asn Pro Lys Glu Ser Ser Tyr Ala Ser
305 310 315 320

Gly Ser Thr Val Thr Leu Thr Ala Thr Pro Ala Ala Gly Trp Lys Phe

325 330 335

Ala Ser Trp Ser Gly Asp Ala Cys Gln Thr Thr Ser Pro Leu Thr Val

340 345 350

Thr Met Asp Lys Asn Lys Val Ile Thr Ala Lys Phe Thr Pro Val Val

355 360 365

Asp Leu Asn Lys Asn Leu Val Thr Asn Gly Thr Phe Thr Asn Lys Glu

370 375 380

Ser Trp Thr Phe Asn Thr Gly Ser Ser Tyr Gly Asn Ser Glu Gly Thr
385 390 395 400

Phe Asp Val Ser Asn Ser Glu Gly Arg Ile Asn Val Thr Lys Ile Gly

405 410 415

Ser Asn Pro Trp Glu Pro Gln Leu Val Gln Asn Gly Ile Thr Leu Val

420 425 430

Glu Gly Met Asn Tyr Lys Ile Thr Phe Glu Ala Ser Ala Ser Ala Ala

435 440 445

Arg Lys Ile Gly Leu Val Ile Gln Met Ala Gly Gly Asp Tyr Thr Thr

450 455 460

Tyr Phe Glu Lys Asp Ile Asp Leu Thr Ala Ser Asn Gln Gln Phe Ser
465 470 475 480

Tyr Glu Phe Lys Met Asn Ala Pro Ser Asp Glu Ser Gly Arg Ile Gly

485 490 495

Phe Asn Leu Gly Gln Thr Thr Gly Asn Val Thr Leu Ser Lys Ile Thr

500 505 510

Leu His Tyr Leu Glu Glu Asp Ser His Glu Pro Ser Asp Asn Pro Cys

515 520 525

Glu Asp Pro Ser Pro Ile Leu Lys Lys Arg Ile Pro Ala Thr His Phe

530 535 540

Ser Leu Gln Thr Leu Ser Asp Lys Ala Leu Arg Ile Glu Val Asn Ala
545 550 555 560

Pro Thr Ile Val Asp Ile Phe Asp Leu Arg Gly Asn Lys Val Lys Ser

565 570 575

Leu Asn Val Ser Gly Ser Gln Thr Val Lys Leu Ser Leu Pro Ser Gly

580 585 590

Val Tyr Phe Ala Lys Ala Arg Gly Met Lys Ser Val Arg Phe Val Leu
595 600 605 

Arg 

<210> 311 
<211> 3972 
<212> DNA 
<213> unknown 

<220> 

<223> Obtained from an environmental sample.
<400> 311

atgcggaaaa gagtaatagc tttatttgta actctcatct ttgtcatgtc tattttaagt 60 

ccaggatatc ttccatttct gagtactaaa gcaaatgctc aaacacaaaa tacaccaaca 120 

attttaaaat ttgattttga aagcggtaat caaggctgga cggggagagg tctttcaaca 180 

actgttgcaa ccgtttacaa tgttgcttat gaaggtgatt attcattaaa agtttctggc 240 

agaaatgctt catgggatgg agctgttatt gatttaacag acaagctttc ggcaaatgtg 300 

agttatacag tttctctgtt tgttcgtcac agtgaccaaa aacctcaaag attttcagta 360 

tatgcatatg taaaagattc agcaagtgaa aaatatattc cagttgtaga taaagttgca 420 

gttcctaatt attggaagca actggttggt aaattcacaa tcaacacttc aaatccagtc 480 

caaaagattc agctgcttgt ctgtgttcct acaaataaat cattggaatt ttttatcgat 540 

agcgtattaa ttgcaagtag tgcaggagca acatctggag ttgtaaaatc cacaaatttt 600 

gaaagcggta caacagaagg ctggcaagca aggggaacag gttctgttgc tcagattagt 660 

gttgtttcta cagtagctca ttcaggtagt aaaagtttgt atgtgacagg gcgagttcaa 720 

acgtggcaag gtgcacaaat agatttgaca agtttgttag agaagggtaa agaatatcag 780 

ttttctgtgt gggtatatca ggatagtgga agcgaccaaa agctgacact gaccatggaa 840 

aggaaaaatg cagatggaag tacaaattat gatacaataa aatggcagca aacagtttca 900 

agcaatacat gggtagagct aacaggttca tatacagtac ctgcaacagc aacacaacta 960 

atattctaca ttgaatcacc caatgctacc ctaagctttt atattgatga ttttactgct 1020 

gttgataaaa atgcaccagt tgtagcgcct ggaattataa aatcagccac atttgaaagc 1080 

ggtacaacag aagactggca agcaagaggg acaggagtga ccgtttctgt tgttaacaca 1140 

gtggcacata ctgggagcaa gagtttgtat gtgacaggga gaagtcaaaa ttggcatggt 1200 

gcagaaattg atctgacaaa tgtgctagag aagggcaagg aatatcaatt ttctgtgtgg 1260 

gtatatcagg atagtggaag cgatcagaag ctgacattga ccatgcaaag gaaaaatgca 1320 

gataacacaa cagattatga ttctataaaa tatcagcaga cagtagcaac aaatacatgg 1380 

gtagagctaa caggttcata tacagtgccg acaacagcca cccagttaat tttatatgtt 1440 

gaagctgcag atactaccct aagcttttat attgatgatt ttactgctgt tgataaaaac 1500 

ccagaggtaa taccaacagt atcgagagta ccagaatggg aaattccttc actctttgag 1560 

cagtatacga attatttcag cattggtgtg gcaataccat ataaagtact tacaaatcca 1620 

accgaaaagg caatggtact caaacatttc aacagtataa cagctgagaa tgaaatgaaa 1680 

cctgatgcta ttcaaaagac agaagggaat tttacattta atgttgcaga ccaatacgta 1740 
gattttgcac agcaaaatag aattggaatc agaggtcaca ctcttgtttg gcatcagcaa 1800 

acaccaaatt ggttcttcca gcatagtgat ggtactccgc ttgatccaag caatcctgct 1860 

gacaaacaac ttctacgcga tagattaaga acgcatatcc aaacacttgt tggaagatac 1920 

gcagggaaaa tttatgcatg ggatgttgta aacgaggcaa ttgacgagaa ccagccagat 1980 

ggatacagaa gaagtgaatg gtacagaata ttggggccaa ctgatacaac agatggcatt 2040 

ccagaatata ttctgcttgc attccagtat gcaagagagg cagacccaaa tactaagctc 2100 

ttttataacg actataatac agaaaatcca aagaaaagac agtttatata caatcttgtt 2160 

aagaagctca aagaaagagg cttgattgat ggtgtaggtc tgcagtgcca tattaatgtc 2220 

gattcaccta cagttaaaga gatagaggat acaattaaac tgtttagtac aatccctggc 2280 

ttagacattc acattacaga gcttgacatt agcgtttata caagcagcag ccagagatat 2340 

gatactcttc cacaggatat aatgataaaa caggctttga agttcaaaga gctttttgag 2400 

atgctaaaaa gatatagcta tgttgtcaca aacgttactt tctggggact caaagatgac 2460 

tattcatggc tttcaacaag cagatctaac tggccactac tgtttgacaa caactaccag 2520 

gcaaaatttg catactgggc aattgttgaa ccgtcagtat tgccacttgc tataaacaaa 2580 

ggatatgcaa acaatgcatc agcaaggata gatggagttt tagacagaga atacaaaggt 2640 

gcgattccaa ttaagattac aaatgaaagt ggacaagaag ttgcaactgt tcgagctcta 2700 

tggaattcaa gtgaactcag cctctatata tcggtcaatg atacaacaat agatgctgct 2760 

aatgataaag tagttgtatt tgtagaccag gataatggaa aaatgccaga aattaaacct 2820 

gatgactatt gggtttcaat tacgagaact ggtacaaaag cacaatcagc tcaaggctat 2880 

gtaaaggatt atgctgtcgt gcagcaagca aatggatatg tggttgagtt gaagctttta 2940 

attaataaca cgttaactgt taactcttct ataggttttg atatagcaat ctttgacaat 3000 

ggagttcaat acagctggaa tgacaagaca aactcacagt ttatagaaac tgataactat 3060 

ggtattttaa caatggcaga tagcgtcaag tttgcttctg ctccaaaagg tacagcaata 3120 

attgatgcag aattagatga tacatggaaa aacgctcagg aaataacaac tgacacaaag 3180 

gtcacggtta caggcacagt atacgactca gcttatgcaa aggctaagat gatgtgggat 3240 

gaaaatagta tctatgtcta tgcaattgtt tatgacttgc ttttgaacaa ggctaataca 3300 

aatccatggg agcaggattc aattgagata tttgtggatg aaaataatca caaaacgcct 3360 

tactatgaaa atgatgatgt tcagtacaga gtgaactatg agaatactca aacatttggc 3420 

acgaacggtg ctcctcagaa cttcattaca gcaacaaaga taattccaaa cggatatata 3480 

gtggaagctc aagtttacat gaggacgaca aagctttctg aaggaatggt tataggcttt 3540 

gacattcaag tgaatgatgc agaccataca ggtaaaagag tcggtgttct aacctggaat 3600 

gataaggttg ggaacaatta tagagacaca acaaggttta gatgcttaga gcttgtagca 3660 

gcacctgtaa gccagccacc aatacaagct ccatcaccat cacaaccaac aacaataacg 3720 

tatatactaa caccgacacc aacacagcca tcaacccaaa cacagcagca acctgctcag 3780 

caaccatcac agcagcaaca gcaaccgcaa cagcagcagc ctgcacagac acaacaacct 3840 

cagacacagc ctgcacaaaa gcctcagaat gttgtttcga taaagataga ccagacaaaa 3900 

gctgagacat ttactgttgg cgctgatacc aaggttgttg tacctcaagg ttctgtaact 3960 

ggtgcaaact ga 3972 
<210> 312 
<211> 1323 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(33) 

<400> 312 

Met Arg Lys Arg Val Ile Ala Leu Phe Val Thr Leu Ile Phe Val Met 
1 5 10 15 

Ser Ile Leu Ser Pro Gly Tyr Leu Pro Phe Leu Ser Thr Lys Ala Asn 

20 25 30 

Ala Gln Thr Gln Asn Thr Pro Thr Ile Leu Lys Phe Asp Phe Glu Ser 

35 40 45 

Gly Asn Gln Gly Trp Thr Gly Arg Gly Leu Ser Thr Thr Val Ala Thr 

50 55 60 

Val Tyr Asn Val Ala Tyr Glu Gly Asp Tyr Ser Leu Lys Val Ser Gly 
65 70 75 80 

Arg Asn Ala Ser Trp Asp Gly Ala Val Ile Asp Leu Thr Asp Lys Leu 

85 90 95 

Ser Ala Asn Val Ser Tyr Thr Val Ser Leu Phe Val Arg His Ser Asp 

100 105 110 

Gln Lys Pro Gln Arg Phe Ser Val Tyr Ala Tyr Val Lys Asp Ser Ala 

115 120 125 

Ser Glu Lys Tyr Ile Pro Val Val Asp Lys Val Ala Val Pro Asn Tyr 

130 135 140 

Trp Lys Gln Leu Val Gly Lys Phe Thr Ile Asn Thr Ser Asn Pro Val 
145 150 155 160 

Gln Lys Ile Gln Leu Leu Val Cys Val Pro Thr Asn Lys Ser Leu Glu 

165 170 175 

Phe Phe Ile Asp Ser Val Leu Ile Ala Ser Ser Ala Gly Ala Thr Ser 

180 185 190 

Gly Val Val Lys Ser Thr Asn Phe Glu Ser Gly Thr Thr Glu Gly Trp 

195 200 205 

Gln Ala Arg Gly Thr Gly Ser Val Ala Gln Ile Ser Val Val Ser Thr 

210 215 220 

Val Ala His Ser Gly Ser Lys Ser Leu Tyr Val Thr Gly Arg Val Gln 
225 230 235 240 

Thr Trp Gln Gly Ala Gln Ile Asp Leu Thr Ser Leu Leu Glu Lys Gly 

245 250 255 

Lys Glu Tyr Gln Phe Ser Val Trp Val Tyr Gln Asp Ser Gly Ser Asp 

260 265 270 

Gln Lys Leu Thr Leu Thr Met Glu Arg Lys Asn Ala Asp Gly Ser Thr 

275 280 285 

Asn Tyr Asp Thr Ile Lys Trp Gln Gln Thr Val Ser Ser Asn Thr Trp 

290 295 300 

Val Glu Leu Thr Gly Ser Tyr Thr Val Pro Ala Thr Ala Thr Gln Leu 
305 310 315 320 

Ile Phe Tyr Ile Glu Ser Pro Asn Ala Thr Leu Ser Phe Tyr Ile Asp 

325 330 335 

Asp Phe Thr Ala Val Asp Lys Asn Ala Pro Val Val Ala Pro Gly Ile 

340 345 350 

Ile Lys Ser Ala Thr Phe Glu Ser Gly Thr Thr Glu Asp Trp Gln Ala 

355 360 365 

Arg Gly Thr Gly Val Thr Val Ser Val Val Asn Thr Val Ala His Thr 

370 375 380 

Gly Ser Lys Ser Leu Tyr Val Thr Gly Arg Ser Gln Asn Trp His Gly 
385 390 395 400 

Ala Glu Ile Asp Leu Thr Asn Val Leu Glu Lys Gly Lys Glu Tyr Gln 

405 410 415 

Phe Ser Val Trp Val Tyr Gln Asp Ser Gly Ser Asp Gln Lys Leu Thr 

420 425 430 

Leu Thr Met Gln Arg Lys Asn Ala Asp Asn Thr Thr Asp Tyr Asp Ser 

435 440 445 

Ile Lys Tyr Gln Gln Thr Val Ala Thr Asn Thr Trp Val Glu Leu Thr 
450 455 460 

Gly Ser Tyr Thr Val Pro Thr Thr Ala Thr Gln Leu Ile Leu Tyr Val 
465 470 475 480 

Glu Ala Ala Asp Thr Thr Leu Ser Phe Tyr Ile Asp Asp Phe Thr Ala 

485 490 495 

Val Asp Lys Asn Pro Glu Val Ile Pro Thr Val Ser Arg Val Pro Glu 

500 505 510 

Trp Glu Ile Pro Ser Leu Phe Glu Gln Tyr Thr Asn Tyr Phe Ser Ile 

515 520 525 

Gly Val Ala Ile Pro Tyr Lys Val Leu Thr Asn Pro Thr Glu Lys Ala 

530 535 540 

Met Val Leu Lys His Phe Asn Ser Ile Thr Ala Glu Asn Glu Met Lys 
545 550 555 560 

Pro Asp Ala Ile Gln Lys Thr Glu Gly Asn Phe Thr Phe Asn Val Ala 

565 570 575 

Asp Gln Tyr Val Asp Phe Ala Gln Gln Asn Arg Ile Gly Ile Arg Gly 

580 585 590 

His Thr Leu Val Trp His Gln Gln Thr Pro Asn Trp Phe Phe Gln His 

595 600 605 

Ser Asp Gly Thr Pro Leu Asp Pro Ser Asn Pro Ala Asp Lys Gln Leu 

610 615 620 

Leu Arg Asp Arg Leu Arg Thr His Ile Gln Thr Leu Val Gly Arg Tyr 
625 630 635 640 

Ala Gly Lys Ile Tyr Ala Trp Asp Val Val Asn Glu Ala Ile Asp Glu 

645 650 655 

Asn Gln Pro Asp Gly Tyr Arg Arg Ser Glu Trp Tyr Arg Ile Leu Gly 

660 665 670 

Pro Thr Asp Thr Thr Asp Gly Ile Pro Glu Tyr Ile Leu Leu Ala Phe 

675 680 685 

Gln Tyr Ala Arg Glu Ala Asp Pro Asn Thr Lys Leu Phe Tyr Asn Asp 

690 695 700 

Tyr Asn Thr Glu Asn Pro Lys Lys Arg Gln Phe Ile Tyr Asn Leu Val 
705 710 715 720 

Lys Lys Leu Lys Glu Arg Gly Leu Ile Asp Gly Val Gly Leu Gln Cys 

725 730 735 

His Ile Asn Val Asp Ser Pro Thr Val Lys Glu Ile Glu Asp Thr Ile 

740 745 750 

Lys Leu Phe Ser Thr Ile Pro Gly Leu Asp Ile His Ile Thr Glu Leu 

755 760 765 

Asp Ile Ser Val Tyr Thr Ser Ser Ser Gln Arg Tyr Asp Thr Leu Pro 

770 775 780 

Gln Asp Ile Met Ile Lys Gln Ala Leu Lys Phe Lys Glu Leu Phe Glu 
785 790 795 800 

Met Leu Lys Arg Tyr Ser Tyr Val Val Thr Asn Val Thr Phe Trp Gly 

805 810 815 

Leu Lys Asp Asp Tyr Ser Trp Leu Ser Thr Ser Arg Ser Asn Trp Pro 

820 825 830 

Leu Leu Phe Asp Asn Asn Tyr Gln Ala Lys Phe Ala Tyr Trp Ala Ile 

835 840 845 

Val Glu Pro Ser Val Leu Pro Leu Ala Ile Asn Lys Gly Tyr Ala Asn 

850 855 860 

Asn Ala Ser Ala Arg Ile Asp Gly Val Leu Asp Arg Glu Tyr Lys Gly 
865 870 875 880 

Ala Ile Pro Ile Lys Ile Thr Asn Glu Ser Gly Gln Glu Val Ala Thr 

885 890 895 

Val Arg Ala Leu Trp Asn Ser Ser Glu Leu Ser Leu Tyr Ile Ser Val 

900 905 910 

Asn Asp Thr Thr Ile Asp Ala Ala Asn Asp Lys Val Val Val Phe Val 

915 920 925 

Asp Gln Asp Asn Gly Lys Met Pro Glu Ile Lys Pro Asp Asp Tyr Trp 

930 935 940 

Val Ser Ile Thr Arg Thr Gly Thr Lys Ala Gln Ser Ala Gln Gly Tyr 
945 950 955 960 

Val Lys Asp Tyr Ala Val Val Gln Gln Ala Asn Gly Tyr Val Val Glu 

965 970 975 

Leu Lys Leu Leu Ile Asn Asn Thr Leu Thr Val Asn Ser Ser Ile Gly 

980 985 990 

Phe Asp Ile Ala Ile Phe Asp Asn Gly Val Gln Tyr Ser Trp Asn Asp 
995 1000 1005 
Lys Thr Asn Ser Gln Phe Ile Glu Thr Asp Asn Tyr Gly Ile Leu Thr 

1010 1015 1020 

Met Ala Asp Ser Val Lys Phe Ala Ser Ala Pro Lys Gly Thr Ala Ile 
1025 1030 1035 1040 

Ile Asp Ala Glu Leu Asp Asp Thr Trp Lys Asn Ala Gln Glu Ile Thr 

1045 1050 1055 

Thr Asp Thr Lys Val Thr Val Thr Gly Thr Val Tyr Asp Ser Ala Tyr 

1060 1065 1070 

Ala Lys Ala Lys Met Met Trp Asp Glu Asn Ser Ile Tyr Val Tyr Ala 

1075 1080 1085 

Ile Val Tyr Asp Leu Leu Leu Asn Lys Ala Asn Thr Asn Pro Trp Glu 

1090 1095 1100 

Gln Asp Ser Ile Glu Ile Phe Val Asp Glu Asn Asn His Lys Thr Pro 
1105 1110 1115 1120 

Tyr Tyr Glu Asn Asp Asp Val Gln Tyr Arg Val Asn Tyr Glu Asn Thr 

1125 1130 1135 

Gln Thr Phe Gly Thr Asn Gly Ala Pro Gln Asn Phe Ile Thr Ala Thr 

1140 1145 1150 

Lys Ile Ile Pro Asn Gly Tyr Ile Val Glu Ala Gln Val Tyr Met Arg 

1155 1160 1165 

Thr Thr Lys Leu Ser Glu Gly Met Val Ile Gly Phe Asp Ile Gln Val 

1170 1175 1180 

Asn Asp Ala Asp His Thr Gly Lys Arg Val Gly Val Leu Thr Trp Asn 
1185 1190 1195 1200 

Asp Lys Val Gly Asn Asn Tyr Arg Asp Thr Thr Arg Phe Arg Cys Leu 

1205 1210 1215 

Glu Leu Val Ala Ala Pro Val Ser Gln Pro Pro Ile Gln Ala Pro Ser 

1220 1225 1230 

Pro Ser Gln Pro Thr Thr Ile Thr Tyr Ile Leu Thr Pro Thr Pro Thr 

1235 1240 1245 

Gln Pro Ser Thr Gln Thr Gln Gln Gln Pro Ala Gln Gln Pro Ser Gln 

1250 1255 1260 

Gln Gln Gln Gln Pro Gln Gln Gln Gln Pro Ala Gln Thr Gln Gln Pro 
1265 1270 1275 1280 

Gln Thr Gln Pro Ala Gln Lys Pro Gln Asn Val Val Ser Ile Lys Ile 

1285 1290 1295 

Asp Gln Thr Lys Ala Glu Thr Phe Thr Val Gly Ala Asp Thr Lys Val 

1300 1305 1310 

Val Val Pro Gln Gly Ser Val Thr Gly Ala Asn 
1315 1320 

<210> 313 
<211> 1392 
<212> DNA 
<213> Bacteria 

<400> 313 

gtgaaacgtc tatccgcgct gaccgccgtc gtattgttag cgctaacaac tcacgtcgcc 60 

gccgctgacc ccgcgccacc cgccaccggc cccgccatcg acttccgggc cgaactccag 120 

cccatcgacg gattcggctt ctccatggcc ttccagcggg ccgacctgct gcacggcgcg 180 

cgcggcctca gccccgccaa gcggcgcgag gtgctcgacc tgctgctcga caaggagagg 240 

ggcgcgggcc tgtcgatcct gcgcctgggc atcgggtcgt cgaccgaccg ggtctacgac 300 

cacatgccga cgatcctgcc gaccgatccc ggcgggccgg acgccccgcc gaagtacgtc 360 

tgggacggct gggacggcgg ccaggtctgg ctcgccaagg aggccaaggc gtacggcgtc 420 

aagcggttct tcgccgacgc ctggagcgcg ccggccttca tgaagaccaa cggcagcgag 480 

aacgacggcg gcgagctccg gcccgaatgg cgccaggcct acgcgaacta cctcgtcaag 540 

tacgcgaagt tctaccaacg ggaaggcatc ccgatcaccg acctggggtt caccaacgaa 600 

cccgactggg cggcgaccta cgcctcgatg cgtttcaccc cgcagcaggc cgtcgacttc 660 

ctcaaggtgc tcgggccgac cgtccgcgcg tccggactga agaccggcgt cgtctgctgc 720 

gacgcggcgg gctgggaccg gcaggtcgcc tacaccgagg ccatcgaggc ggaccccgag 780 

gccgccaagg ccgtgcggac cgtcaccggc caccgctaca gcggtccgac cacggtcccg 840 

cagcccaccg acaagcgggt ctggatgtcg gagtggtcac cggacggcac cacctggaac 900 

gagaactggg acgacggcag cggctacgac ggcctcaccg tcgccgccga catccagaac 960 

accctcaccg tcggcaacgc caacgcctac gtctactgga ccggcgcgtc cctcggcgcc 1020 

acccggggac tcatccagct cgccaacccc ggcgactcct accgggtgtc caagcggtac 1080 

tgggcgctgg ccgccttcag ccgcttcatc cgccccgacg ccgtccgcgt accggtcacg 1140 

aacgccgacc cggccctgag cgtcacggcc ttccgcaaca ccgacggcag ccgcgtgatc 1200 

gagatcctca acacggcgac caccgagaag tccgcccagt tcgccctccg cggcggccac 1260 

gaccggcacc ccgagggcta cgtcaccgac gagacccgct cgatcacccc ggcccacgtc 1320 

gcctccgcgc gcggtacgac cctcaaggcc acgctcgccc cgcgcgcgct gaccacgatc 1380 
gtcctcgact ga 1392 

<210> 314 
<211> 463 
<212> PRT 
<213> Bacteria 

<220> 

<221> SIGNAL 
<222> (1)...(22) 

<400> 314 

Met Lys Arg Leu Ser Ala Leu Thr Ala Val Val Leu Leu Ala Leu Thr 
1 5 10 15 

Thr His Val Ala Ala Ala Asp Pro Ala Pro Pro Ala Thr Gly Pro Ala 

20 25 30 

Ile Asp Phe Arg Ala Glu Leu Gln Pro Ile Asp Gly Phe Gly Phe Ser 

35 40 45 

Met Ala Phe Gln Arg Ala Asp Leu Leu His Gly Ala Arg Gly Leu Ser 

50 55 60 

Pro Ala Lys Arg Arg Glu Val Leu Asp Leu Leu Leu Asp Lys Glu Arg 
65 70 75 80 

Gly Ala Gly Leu Ser Ile Leu Arg Leu Gly Ile Gly Ser Ser Thr Asp 

85 90 95 

Arg Val Tyr Asp His Met Pro Thr Ile Leu Pro Thr Asp Pro Gly Gly 

100 105 110 

Pro Asp Ala Pro Pro Lys Tyr Val Trp Asp Gly Trp Asp Gly Gly Gln 

115 120 125 

Val Trp Leu Ala Lys Glu Ala Lys Ala Tyr Gly Val Lys Arg Phe Phe 

130 135 140 

Ala Asp Ala Trp Ser Ala Pro Ala Phe Met Lys Thr Asn Gly Ser Glu 
145 150 155 160 

Asn Asp Gly Gly Glu Leu Arg Pro Glu Trp Arg Gln Ala Tyr Ala Asn 

165 170 175 

Tyr Leu Val Lys Tyr Ala Lys Phe Tyr Gln Arg Glu Gly Ile Pro Ile 

180 185 190 

Thr Asp Leu Gly Phe Thr Asn Glu Pro Asp Trp Ala Ala Thr Tyr Ala 

195 200 205 

Ser Met Arg Phe Thr Pro Gln Gln Ala Val Asp Phe Leu Lys Val Leu 

210 215 220 

Gly Pro Thr Val Arg Ala Ser Gly Leu Lys Thr Gly Val Val Cys Cys 
225 230 235 240 

Asp Ala Ala Gly Trp Asp Arg Gln Val Ala Tyr Thr Glu Ala Ile Glu 

245 250 255 

Ala Asp Pro Glu Ala Ala Lys Ala Val Arg Thr Val Thr Gly His Arg 

260 265 270 

Tyr Ser Gly Pro Thr Thr Val Pro Gln Pro Thr Asp Lys Arg Val Trp 

275 280 285 

Met Ser Glu Trp Ser Pro Asp Gly Thr Thr Trp Asn Glu Asn Trp Asp 

290 295 300 

Asp Gly Ser Gly Tyr Asp Gly Leu Thr Val Ala Ala Asp Ile Gln Asn 
305 310 315 320 

Thr Leu Thr Val Gly Asn Ala Asn Ala Tyr Val Tyr Trp Thr Gly Ala 

325 330 335 

Ser Leu Gly Ala Thr Arg Gly Leu Ile Gln Leu Ala Asn Pro Gly Asp 

340 345 350 

Ser Tyr Arg Val Ser Lys Arg Tyr Trp Ala Leu Ala Ala Phe Ser Arg 

355 360 365 

Phe Ile Arg Pro Asp Ala Val Arg Val Pro Val Thr Asn Ala Asp Pro 
370 375 380 

Ala Leu Ser Val Thr Ala Phe Arg Asn Thr Asp Gly Ser Arg Val Ile 
385 390 395 400 

Glu Ile Leu Asn Thr Ala Thr Thr Glu Lys Ser Ala Gln Phe Ala Leu 

405 410 415 

Arg Gly Gly His Asp Arg His Pro Glu Gly Tyr Val Thr Asp Glu Thr 

420 425 430 

Arg Ser Ile Thr Pro Ala His Val Ala Ser Ala Arg Gly Thr Thr Leu 
435 440 445 
Lys Ala Thr Leu Ala Pro Arg Ala Leu Thr Thr Ile Val Leu Asp 
450 455 460 

<210> 315 
<211> 1224 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 315 

atgcggaacg tcgtgcgtaa accattgaca atcggactcg ctttaacact attattgccc 60 

atgggaatga cggcaacatc agcgaagaat gcagattcct atgcgaaaaa acctcacatc 120 

agcgcattga atgccccaca attggatcaa cgctacaaaa acgagttcac gattggtgcg 180 

gcagtagaac cttatcaact acaaaatgaa aaagacgtac aaatgctaaa gcgccacttc 240 

aacagcattg ttgccgagaa cgtaatgaaa ccgatcagca ttcaacctga ggaaggaaaa 300 

ttcaattttg aacaagcgga tcgaattgtg aagttcgcta aggcaaatgg catggatatt 360 

cgcttccata cactcgtttg gcacagccaa gtacctcaat ggttctttct tgacaaggaa 420 

ggcaagccaa tggttaatga aacagatcca gtgaaacgtg aacaaaataa acaactgctg 480 

ttaaaacgac ttgaaactca tattaaaacg atcgtcgagc ggtacaaaga tgacattaag 540 

tactgggacg ttgtaaatga ggttgtgggg gacgacggaa aactgcgcaa ctctccatgg 600 

tatcaaatcg ccggcatcga ttatattaaa gtggcattcc aaacagcgag aaaatatggc 660 

ggcaacaaga ttaaacttta tatcaatgat tacaataccg aagtggaacc aaagcgaagc 720 

gctctttata acttggtgaa gcaattaaaa gaagagggcg ttcctattga cggcatcggc 780 

catcaatccc acattcaaat cggctggcct tctgaagcag aaatcgagaa aacgattaac 840 

atgttcgccg ctctcggctt agacaaccaa atcactgagc ttgatgtgag catgtacggt 900 

tggccgccgc gcgcttaccc gacgtatgac gccattccaa aacaaaagtt tttggatcag 960 

gcagcgcgct atgatcgttt gttcaaactg tatgaaaagt tgagcgataa aattagcaac 1020 

gtcaccttct ggggcatcgc cgacaatcat acgtggctcg acagccgtgc ggatgtgtac 1080 

tatgacgcca acgggaatgt tgtggttgac ccgaacgctc cgtacgcaaa agtggaaaaa 1140 

gggaaaggaa aagatgcgcc gttcgttttt ggaccggatt acaaagtcaa acccgcatat 1200 

tgggctatta tcgaccacaa atag 1224 

<210> 316 
<211> 407 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(28) 

<400> 316 

Met Arg Asn Val Val Arg Lys Pro Leu Thr Ile Gly Leu Ala Leu Thr 
1 5 10 15 

Leu Leu Leu Pro Met Gly Met Thr Ala Thr Ser Ala Lys Asn Ala Asp 

20 25 30 

Ser Tyr Ala Lys Lys Pro His Ile Ser Ala Leu Asn Ala Pro Gln Leu 

35 40 45 

Asp Gln Arg Tyr Lys Asn Glu Phe Thr Ile Gly Ala Ala Val Glu Pro 

50 55 60 

Tyr Gln Leu Gln Asn Glu Lys Asp Val Gln Met Leu Lys Arg His Phe 
65 70 75 80 

Asn Ser Ile Val Ala Glu Asn Val Met Lys Pro Ile Ser Ile Gln Pro 

85 90 95 

Glu Glu Gly Lys Phe Asn Phe Glu Gln Ala Asp Arg Ile Val Lys Phe 

100 105 110 

Ala Lys Ala Asn Gly Met Asp Ile Arg Phe His Thr Leu Val Trp His 

115 120 125 

Ser Gln Val Pro Gln Trp Phe Phe Leu Asp Lys Glu Gly Lys Pro Met 

130 135 140 

Val Asn Glu Thr Asp Pro Val Lys Arg Glu Gln Asn Lys Gln Leu Leu 
145 150 155 160 

Leu Lys Arg Leu Glu Thr His Ile Lys Thr Ile Val Glu Arg Tyr Lys 

165 170 175 

Asp Asp Ile Lys Tyr Trp Asp Val Val Asn Glu Val Val Gly Asp Asp 
180 185 190 

Gly Lys Leu Arg Asn Ser Pro Trp Tyr Gln Ile Ala Gly Ile Asp Tyr 

195 200 205 

Ile Lys Val Ala Phe Gln Thr Ala Arg Lys Tyr Gly Gly Asn Lys Ile 

210 215 220 

Lys Leu Tyr Ile Asn Asp Tyr Asn Thr Glu Val Glu Pro Lys Arg Ser 
225 230 235 240 

Ala Leu Tyr Asn Leu Val Lys Gln Leu Lys Glu Glu Gly Val Pro Ile 

245 250 255 

Asp Gly Ile Gly His Gln Ser His Ile Gln Ile Gly Trp Pro Ser Glu 

260 265 270 

Ala Glu Ile Glu Lys Thr Ile Asn Met Phe Ala Ala Leu Gly Leu Asp 

275 280 285 

Asn Gln Ile Thr Glu Leu Asp Val Ser Met Tyr Gly Trp Pro Pro Arg 

290 295 300 

Ala Tyr Pro Thr Tyr Asp Ala Ile Pro Lys Gln Lys Phe Leu Asp Gln 
305 310 315 320 

Ala Ala Arg Tyr Asp Arg Leu Phe Lys Leu Tyr Glu Lys Leu Ser Asp 

325 330 335 

Lys Ile Ser Asn Val Thr Phe Trp Gly Ile Ala Asp Asn His Thr Trp 

340 345 350 

Leu Asp Ser Arg Ala Asp Val Tyr Tyr Asp Ala Asn Gly Asn Val Val 

355 360 365 

Val Asp Pro Asn Ala Pro Tyr Ala Lys Val Glu Lys Gly Lys Gly Lys 

370 375 380 

Asp Ala Pro Phe Val Phe Gly Pro Asp Tyr Lys Val Lys Pro Ala Tyr 
385 390 395 400 

Trp Ala Ile Ile Asp His Lys 
405 

<210> 317 
<211> 1695 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 317 

gtggctggaa gctcgctcac gagcaacggc ctctcggcca ttctctcgct ccagtcggac 60 

tggggcagcg gttactgcgc gacggtagaa cttcagaacg tcggcggaac tccgatcacg 120 

gcgtgggagg tccaggtgga gctcgctggg acgaccgtga acagcagcca cagcgcggcg 180 

ttctcctcga caggcacccg cctggtcgcc aagcccttgt cctggaacgc gacgctggca 240 

cccgccgcca agacgacctt cggcttctgc gcggccgctc cgagcgcagc ggcgcgcccg 300 

tccgtggtgc aagtgacagc gaacggctcc gccaccggaa cgggcggaac gagcggcggc 360 

ggcacgggcg gctcgaccgc tacgggcggc tcgaccgcta cgggcggctc cggtgggtcg 420 

accgcgggag tgtgcgcggc aacctacgag gccgagagca tgctccacag caccggcaac 480 

gccatcagcg gcggctggaa catctattcg aacggcaaca tcaccgccac gcactccttc 540 

gcagccggct cgaatcgact caccgtgcac gccaagggcg accaggccaa cggggcgccc 600 

atcatgcgcg tcagcgtggg caacaccgtc gtcggcgagg tgccagtgcc ggtgaccgtg 660 

tggacaccgt actgcttcga ctacgccgcg gcgagcgcag gcgcgcagac cgtcaagatc 720 

gagttcacga acgactacaa tggcggcacc ggcgccgacc gcaatctgca cgtggacaag 780 

gtcgcggtgc agtgcggcgc gagctgcaac agcgggagcg gagggggcac cggcggctcg 840 

agcggaagcg gcggcacctc ggccaccggc ggctccgcca gcggtggcgc ggcagggacg 900 

acctgcacga acgttcgtcc cactggaacc gactgggacg cggcgacctg cgacatgtgg 960 

gcctcgcaaa ctagcgagtg cagcgcggcc tggatgatcg acaaccatta ctgcgaccag 1020 

agctgcgggc gctgctcggg cgggagcggg accggtggca cgaacacggg aggcaccggc 1080 

ggtggagtga ccccgagtac ctgcacggag cccaattctc agcagtgctc cacctacaag 1140 

gtcgggactc actgcggcct cacctacgag atctggaccg acggctccgc gggctgcatg 1200 

acgaacacct cctacgggtt cctcgccaat tggagccagg ggaacgcaaa ctacctggct 1260 

cgcaagggcg ttcggcccgg ctcgtcgcga ccggtcgtga cgtacagcgc gaactaccag 1320 

ccgaacggga attcctacct ggggatctac ggttggacgc agaacccgct cgtcgagtac 1380 

tacatcatcg atagctgggg gagctggcgt ccaccgggga cccaggcgat gggcaccgtc 1440 

caggtggacg gcgggaccta cgatatctac cggagcgagc gggtgaacaa gccctcgatc 1500 

gagggcaaca agaccttctg gcagtactgg agcgtccgca cccagaagcg caccagtggg 1560 

accatcaccg tggctccgca cttcgccgcg tgggcggcat ccggactgca gatgggctcc 1620 

ttctacgagg tctccctggt ggtggagggc tacaacagct ccggcagcgc cgacgtaacg 1680 

gtgtcgttcc ggtag 1695 
<210> 318 
<211> 564 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 318 

Met Ala Gly Ser Ser Leu Thr Ser Asn Gly Leu Ser Ala Ile Leu Ser 

1 5 10 15 

Leu Gln Ser Asp Trp Gly Ser Gly Tyr Cys Ala Thr Val Glu Leu Gln 

20 25 30 

Asn Val Gly Gly Thr Pro Ile Thr Ala Trp Glu Val Gln Val Glu Leu 

35 40 45 

Ala Gly Thr Thr Val Asn Ser Ser His Ser Ala Ala Phe Ser Ser Thr 

50 55 60 

Gly Thr Arg Leu Val Ala Lys Pro Leu Ser Trp Asn Ala Thr Leu Ala 
65 70 75 80 

Pro Ala Ala Lys Thr Thr Phe Gly Phe Cys Ala Ala Ala Pro Ser Ala 

85 90 95 

Ala Ala Arg Pro Ser Val Val Gln Val Thr Ala Asn Gly Ser Ala Thr 

100 105 110 

Gly Thr Gly Gly Thr Ser Gly Gly Gly Thr Gly Gly Ser Thr Ala Thr 

115 120 125 

Gly Gly Ser Thr Ala Thr Gly Gly Ser Gly Gly Ser Thr Ala Gly Val 

130 135 140 

Cys Ala Ala Thr Tyr Glu Ala Glu Ser Met Leu His Ser Thr Gly Asn 
145 150 155 160 

Ala Ile Ser Gly Gly Trp Asn Ile Tyr Ser Asn Gly Asn Ile Thr Ala 

165 170 175 

Thr His Ser Phe Ala Ala Gly Ser Asn Arg Leu Thr Val His Ala Lys 

180 185 190 

Gly Asp Gln Ala Asn Gly Ala Pro Ile Met Arg Val Ser Val Gly Asn 

195 200 205 

Thr Val Val Gly Glu Val Pro Val Pro Val Thr Val Trp Thr Pro Tyr 

210 215 220 

Cys Phe Asp Tyr Ala Ala Ala Ser Ala Gly Ala Gln Thr Val Lys Ile 
225 230 235 240 

Glu Phe Thr Asn Asp Tyr Asn Gly Gly Thr Gly Ala Asp Arg Asn Leu 

245 250 255 

His Val Asp Lys Val Ala Val Gln Cys Gly Ala Ser Cys Asn Ser Gly 

260 265 270 

Ser Gly Gly Gly Thr Gly Gly Ser Ser Gly Ser Gly Gly Thr Ser Ala 

275 280 285 

Thr Gly Gly Ser Ala Ser Gly Gly Ala Ala Gly Thr Thr Cys Thr Asn 

290 295 300 

Val Arg Pro Thr Gly Thr Asp Trp Asp Ala Ala Thr Cys Asp Met Trp 
305 310 315 320 

Ala Ser Gln Thr Ser Glu Cys Ser Ala Ala Trp Met Ile Asp Asn His 

325 330 335 

Tyr Cys Asp Gln Ser Cys Gly Arg Cys Ser Gly Gly Ser Gly Thr Gly 

340 345 350 

Gly Thr Asn Thr Gly Gly Thr Gly Gly Gly Val Thr Pro Ser Thr Cys 

355 360 365 

Thr Glu Pro Asn Ser Gln Gln Cys Ser Thr Tyr Lys Val Gly Thr His 

370 375 380 

Cys Gly Leu Thr Tyr Glu Ile Trp Thr Asp Gly Ser Ala Gly Cys Met 
385 390 395 400 

Thr Asn Thr Ser Tyr Gly Phe Leu Ala Asn Trp Ser Gln Gly Asn Ala 

405 410 415 

Asn Tyr Leu Ala Arg Lys Gly Val Arg Pro Gly Ser Ser Arg Pro Val 

420 425 430 

Val Thr Tyr Ser Ala Asn Tyr Gln Pro Asn Gly Asn Ser Tyr Leu Gly 

435 440 445 

Ile Tyr Gly Trp Thr Gln Asn Pro Leu Val Glu Tyr Tyr Ile Ile Asp 

450 455 460 

Ser Trp Gly Ser Trp Arg Pro Pro Gly Thr Gln Ala Met Gly Thr Val 
465 470 475 480 
Gln Val Asp Gly Gly Thr Tyr Asp Ile Tyr Arg Ser Glu Arg Val Asn 

485 490 495 

Lys Pro Ser Ile Glu Gly Asn Lys Thr Phe Trp Gln Tyr Trp Ser Val 

500 505 510 

Arg Thr Gln Lys Arg Thr Ser Gly Thr Ile Thr Val Ala Pro His Phe 

515 520 525 

Ala Ala Trp Ala Ala Ser Gly Leu Gln Met Gly Ser Phe Tyr Glu Val 

530 535 540 

Ser Leu Val Val Glu Gly Tyr Asn Ser Ser Gly Ser Ala Asp Val Thr 
545 550 555 560 

Val Ser Phe Arg 

<210> 319 
<211> 1095 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 319 

atgaaggtga cccgaacagc tgtcgcgggc attgtcgccg cagcggtcct catcacgatc 60 

ggcacgtcga ccgcgtcggc tgaggatgaa ccaaccagcg agaacacgtc gacggatcag 120 

ccgttgcgcg tcctggcagc caaagccggg atcgcgttcg gcacggccgt cgacatgaac 180 

gcgtacaaca acgacgcgac ctaccgtgag ctcgtcggcc aggagttctc gagcgtcacg 240 

gccgagaacg tcatgaagtg gcagctcctc gagccgcagc gaggggtcta caactggggt 300 

ccggccgatc agctcgtgcg cgtagccaac gagaacggcc agaaggtgcg cgggcacacg 360 

ctcatctggc acaaccagct gcccacctgg cttaccagcg gagtcgcctc cggtgagatc 420 

acaccggacg agctccggca gctcctgagg aaccacatct tcacggtgat gcgccacttc 480 

aagggcgaga tccaccagtg ggatgtcgcc aacgaggtca tcgacgacag cggcaacctg 540 

cgcaacacga tctggctgca gaacctgggt ccgagctaca tcgcggacgc gttccggtgg 600 

gctcgcaagg ccgacccgga cgccgccctc tatctgaacg actacaacgt cgagggcccg 660 

aacgccaagg ccgatgcgta ctacgccctg gtcaagcagc tcctcgccga cgacgtgccg 720 

gtggacggct tcggaataca ggggcacctc ggtgtgcagt tcggcttctg gcccgcgagt 780 

gcggtggccg acaacatggg gcgcttcgag gcactcggcc tgcagacggc ggtcaccgag 840 

gcggatgtcc ggatgatcat gccgcccgac gaggacaagc tggccgcaca ggcacgtggc 900 

tacagcacgt tggtccaggg ctgcctgatg gccaagcgtt gcaggtcgtt caccgtctgg 960 

ggcttcaccg acaagtactc ctgggttccg ggcaccttcc ccggccaggg cgcggcgaac 1020 

ctcctggccg aggacttcca gcccaagccg gcttactacg ccgtccagga tgacctcgcg 1080 

cgcgccggac ggtag 1095 

<210> 320 
<211> 364 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(27) 

<400> 320 

Met Lys Val Thr Arg Thr Ala Val Ala Gly Ile Val Ala Ala Ala Val 
1 5 10 15 

Leu Ile Thr Ile Gly Thr Ser Thr Ala Ser Ala Glu Asp Glu Pro Thr 

20 25 30 

Ser Glu Asn Thr Ser Thr Asp Gln Pro Leu Arg Val Leu Ala Ala Lys 

35 40 45 

Ala Gly Ile Ala Phe Gly Thr Ala Val Asp Met Asn Ala Tyr Asn Asn 

50 55 60 

Asp Ala Thr Tyr Arg Glu Leu Val Gly Gln Glu Phe Ser Ser Val Thr 
65 70 75 80 

Ala Glu Asn Val Met Lys Trp Gln Leu Leu Glu Pro Gln Arg Gly Val 

85 90 95 

Tyr Asn Trp Gly Pro Ala Asp Gln Leu Val Arg Val Ala Asn Glu Asn 

100 105 110 

Gly Gln Lys Val Arg Gly His Thr Leu Ile Trp His Asn Gln Leu Pro 
115 120 125 

Thr Trp Leu Thr Ser Gly Val Ala Ser Gly Glu Ile Thr Pro Asp Glu 

130 135 140 

Leu Arg Gln Leu Leu Arg Asn His Ile Phe Thr Val Met Arg His Phe 
145 150 155 160 

Lys Gly Glu Ile His Gln Trp Asp Val Ala Asn Glu Val Ile Asp Asp 

165 170 175 

Ser Gly Asn Leu Arg Asn Thr Ile Trp Leu Gln Asn Leu Gly Pro Ser 

180 185 190 

Tyr Ile Ala Asp Ala Phe Arg Trp Ala Arg Lys Ala Asp Pro Asp Ala 

195 200 205 

Ala Leu Tyr Leu Asn Asp Tyr Asn Val Glu Gly Pro Asn Ala Lys Ala 

210 215 220 

Asp Ala Tyr Tyr Ala Leu Val Lys Gln Leu Leu Ala Asp Asp Val Pro 
225 230 235 240 

Val Asp Gly Phe Gly Ile Gln Gly His Leu Gly Val Gln Phe Gly Phe 

245 250 255 

Trp Pro Ala Ser Ala Val Ala Asp Asn Met Gly Arg Phe Glu Ala Leu 

260 265 270 

Gly Leu Gln Thr Ala Val Thr Glu Ala Asp Val Arg Met Ile Met Pro 

275 280 285 

Pro Asp Glu Asp Lys Leu Ala Ala Gln Ala Arg Gly Tyr Ser Thr Leu 

290 295 300 

Val Gln Gly Cys Leu Met Ala Lys Arg Cys Arg Ser Phe Thr Val Trp 
305 310 315 320 

Gly Phe Thr Asp Lys Tyr Ser Trp Val Pro Gly Thr Phe Pro Gly Gln 
325 330 335 

Gly Ala Ala Asn Leu Leu Ala Glu Asp Phe Gln Pro Lys Pro Ala Tyr 

340 345 350 

Tyr Ala Val Gln Asp Asp Leu Ala Arg Ala Gly Arg 
355 360 

<210> 321 
<211> 1608 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 321 

gtggactggt gggacgtgga tattttttcc gcgaaggaaa tcacccaccc gcaactggca 60 

accttcctgg atgcctcacg agaccatcgc aagccggtca tgatcggcga gatgacccca 120 

cgccacgtcg gcgtgatcga ggggcagaaa tgctgggatg aatggtttgg cccgatgatt 180 

gatctgctca aacgtcgccc cgaaatcaag gccacggcct atatcaactg ggaatggcgc 240 

gagtggtccg accgcctcgg cttccgctgg cacaactggg gcgacgcccg catcgagggc 300 

aacgcccttg ttcgtgatcg ctgggtgcag gaactctccc accccatcta tctccacgcg 360 

gcgcgcgacg gatcttgtcc gctgccgcca atcaccgccc tcccatccgc gaccccgtcg 420 

ctccagaccg tgttccagga ccatttcctg atgggtgctg ccttgaatgt gaggcagttc 480 

accgaaaacg acgcaaccaa gaccgctctc atcaagaagc aattcaacac catcacgccc 540 

gagaatgttc tcaagtgggg gccggttcat cctgagccca accggttcaa cttcgaatcc 600 

accgatcgtt acgtggactt tggtgtgaag aaccggatgt tcatcgtcgg ccacaccctc 660 

gtctggcacc accagacacc cgcctgggtg tttcaagatt cccaaggcca gccgctcgac 720 

cgggatggac tgctcaatcg cttgagcaac cacatccaca cggtggttgg acgctacaag 780 

ggccgcatcc acgggtggga tatggtgaac gaggccttga acgatgacgg caccctccgc 840 

cctagccaat ggcttaaaat catcggcccc gactacattg ccaaagcgtt tgcccttgcc 900 

cacgccgccg accctgccgc tgaactgtat tacaacgatt acagtctcga tcatcccgcc 960 

aagtgtgctg gtgcgatcgc gctggtgaag cagctccaga cgaatggcat atccattgcc 1020 

gggattggca cgcagaccca cgtcggactc aacggacctt ccccccagtc ggtggatgat 1080 

tcattgacgg cctttggcca gctcggcgtg aaggtcatgg ttaccgaact cgacgttgat 1140 

gtgctgcccg ccgccagcca aaatcaaaac gcggatctca accagcccgc cttgtccaat 1200 

cccgccctca atcccgccct caatccctat cccgatgggc tgccgcaagc cgtccaggac 1260 

aaactggccg ctcgctatgc ggaactcttc gccgtgttcg tcaagcacgc cgacaaaatc 1320 

agccgcgtca cgttctggtg cgtcaccgac ggcgactcct ggctgaacaa ctggcccgtg 1380 

cgtggccgcg tcaactatcc gctgctgttc gaccgtgcca gccagcccaa gcccgccttc 1440 

gatgcggtca ttcgcgtcgc caaggacccg ccgacggttt cgcacaatct caccccgctc 1500 

cacgatgcgg cgcgggtcct ggtcaatccg cacaagggct ggtaccacca ctacccggac 1560 

aatcacatca acaagtatga gatcgcccgc gatgccgacc tgacggaa 1608 
<210> 322 
<211> 536 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 322 

Met Asp Trp Trp Asp Val Asp Ile Phe Ser Ala Lys Glu Ile Thr His 
1 5 10 15 

Pro Gln Leu Ala Thr Phe Leu Asp Ala Ser Arg Asp His Arg Lys Pro 

20 25 30 

Val Met Ile Gly Glu Met Thr Pro Arg His Val Gly Val Ile Glu Gly 

35 40 45 

Gln Lys Cys Trp Asp Glu Trp Phe Gly Pro Met Ile Asp Leu Leu Lys 

50 55 60 

Arg Arg Pro Glu Ile Lys Ala Thr Ala Tyr Ile Asn Trp Glu Trp Arg 
65 70 75 80 

Glu Trp Ser Asp Arg Leu Gly Phe Arg Trp His Asn Trp Gly Asp Ala 

85 90 95 

Arg Ile Glu Gly Asn Ala Leu Val Arg Asp Arg Trp Val Gln Glu Leu 

100 105 110 

Ser His Pro Ile Tyr Leu His Ala Ala Arg Asp Gly Ser Cys Pro Leu 

115 120 125 

Pro Pro Ile Thr Ala Leu Pro Ser Ala Thr Pro Ser Leu Gln Thr Val 

130 135 140 

Phe Gln Asp His Phe Leu Met Gly Ala Ala Leu Asn Val Arg Gln Phe 
145 150 155 160 

Thr Glu Asn Asp Ala Thr Lys Thr Ala Leu Ile Lys Lys Gln Phe Asn 

165 170 175 

Thr Ile Thr Pro Glu Asn Val Leu Lys Trp Gly Pro Val His Pro Glu 

180 185 190 

Pro Asn Arg Phe Asn Phe Glu Ser Thr Asp Arg Tyr Val Asp Phe Gly 

195 200 205 

Val Lys Asn Arg Met Phe Ile Val Gly His Thr Leu Val Trp His His 

210 215 220 

Gln Thr Pro Ala Trp Val Phe Gln Asp Ser Gln Gly Gln Pro Leu Asp 
225 230 235 240 

Arg Asp Gly Leu Leu Asn Arg Leu Ser Asn His Ile His Thr Val Val 

245 250 255 

Gly Arg Tyr Lys Gly Arg Ile His Gly Trp Asp Met Val Asn Glu Ala 

260 265 270 

Leu Asn Asp Asp Gly Thr Leu Arg Pro Ser Gln Trp Leu Lys Ile Ile 

275 280 285 

Gly Pro Asp Tyr Ile Ala Lys Ala Phe Ala Leu Ala His Ala Ala Asp 

290 295 300 

Pro Ala Ala Glu Leu Tyr Tyr Asn Asp Tyr Ser Leu Asp His Pro Ala 
305 310 315 320 

Lys Cys Ala Gly Ala Ile Ala Leu Val Lys Gln Leu Gln Thr Asn Gly 

325 330 335 

Ile Ser Ile Ala Gly Ile Gly Thr Gln Thr His Val Gly Leu Asn Gly 

340 345 350 

Pro Ser Pro Gln Ser Val Asp Asp Ser Leu Thr Ala Phe Gly Gln Leu 

355 360 365 

Gly Val Lys Val Met Val Thr Glu Leu Asp Val Asp Val Leu Pro Ala 

370 375 380 

Ala Ser Gln Asn Gln Asn Ala Asp Leu Asn Gln Pro Ala Leu Ser Asn 
385 390 395 400 

Pro Ala Leu Asn Pro Ala Leu Asn Pro Tyr Pro Asp Gly Leu Pro Gln 

405 410 415 

Ala Val Gln Asp Lys Leu Ala Ala Arg Tyr Ala Glu Leu Phe Ala Val 

420 425 430 

Phe Val Lys His Ala Asp Lys Ile Ser Arg Val Thr Phe Trp Cys Val 

435 440 445 

Thr Asp Gly Asp Ser Trp Leu Asn Asn Trp Pro Val Arg Gly Arg Val 

450 455 460 

Asn Tyr Pro Leu Leu Phe Asp Arg Ala Ser Gln Pro Lys Pro Ala Phe 
465 470 475 480 
Asp Ala Val Ile Arg Val Ala Lys Asp Pro Pro Thr Val Ser His Asn 

485 490 495 

Leu Thr Pro Leu His Asp Ala Ala Arg Val Leu Val Asn Pro His Lys 

500 505 510 

Gly Trp Tyr His His Tyr Pro Asp Asn His Ile Asn Lys Tyr Glu Ile 

515 520 525 

Ala Arg Asp Ala Asp Leu Thr Glu 
530 535 

<210> 323 
<211> 2355 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample.
<400> 323 

atgatgctca atgcccgttg tatccaactt atgaagttgt tgcttcgctc ttctctttat 60 

cttaccgctg acaaattggc gcaatcattg aatgtatcca agcgaacgat ttattacgat 120 

atacaaaaaa cgaatgaatg gttgcatcat gaagggctga agccgattca atatgcgcgc 180 

gggctcggat ttcgcttgga tgatgaagtg aaacaagaaa taacaacaaa gtggaacaca 240 

ttacaacctg cccgacatta cacatatcag tcatgggagc gaaaagcttg gattggttta 300 

tggattttga ctcgcgttca tccactgtat ttgtctgatt ttttagagaa attacatgta 360 

agcaggagca cgttgttaaa tgacataaag gaactgaaag aagattggca gtcatttcag 420 

ttgcgattgt cattccatcg caaaaaaggg tatttttcat caggggaaga aatccaaaaa 480 

aggaaattga tgattcgtta tattcatcaa atattagcgg cgatggatga ccagcatttc 540 

gctgcagaat tgtcagctga gtgtcaatgg ccaatctttg attggatttg ccaattcgag 600 

tctacttttt ctattcgcta taccggtgag gttattcaaa ctttacctat ttacctcgca 660 

ttgttccaaa gacggtgggc tagaggcaaa tttgtgcaaa tggacgagca agaaaaagaa 720 

gtgctaaggt caatgcggga ataccagatt gctgatcatc tcgttagacg aattgaaaac 780 

gtttccgaaa tatctattcc cgatgacgag gtttgttatt tgacgaccca tttactcagt 840 

tttcgagttg cagatgacaa gcaaatcgat cataacgatg acatcactac tttgaaacga 900 

atcattcgac atatggtgga tgattttcaa acttatgcct gtgtacaatt caagcgtcgc 960 

gaagagttgg aaaaaaattt attggttcat atgaagcctg cctattatcg actgaaatac 1020 

ggttttcatc tgcaaaacga tctgaccgaa tcggtcaaag cgaactatca agatttattt 1080 

accttaacga aaaaagtcgt ccatcattta gaaagtgtag ttggccagcc ggtcagcgac 1140 

gatgaaattg cttatatcgc catgcatttt ggcggatggt tggacagaga gggggtgtcg 1200 

gttccagtac ggaaaaaggt gttgatcgtc tgcgagagcg ggattggaac atcgcgaatg 1260 

ttgcaaaaac aattggatca acgctacaaa aacgagttca cgattggtgc ggcagtagaa 1320 

ccttatcaac tacaaaatga aaaagacgta caaatgctaa agcgccactt caacagcatt 1380 

gttgccgaga acgtaatgaa accgatcagc attcaacctg aggaaggaaa attcaatttt 1440 

gaacaagcgg atcgaattgt gaagttcgct aaggcaaatg gcatggatat tcgcttccat 1500 

acactcgttt ggcacagcca agtacctcaa tggttctttc ttgacaagga aggcaagcca 1560 

atggttaatg aaacagatcc agtgaaacgt gaacaaaata aacaactgct gttaaaacga 1620 

cttgaaactc atattaaaac gatcgtcgag cggtacaaag atgacattaa gtactgggac 1680 

gttgtaaatg aggttgtggg ggacgacgga aaactgcgca actctccatg gtatcaaatc 1740 

gccggcatcg attatattaa agtggcattc caaacagcga gaaaatatgg cggcaacaag 1800 

attaaacttt atatcaatga ttacaatacc gaagtggaac caaagcgaag cgctctttat 1860 

aacttggtga agcaattaaa agaagagggc gttcctattg acggcatcgg ccatcaatcc 1920 

cacattcaaa tcggctggcc ttctgaagca gaaatcgaga aaacgattaa catgttcgcc 1980 

gctctcggct tagacaacca aatcactgag cttgatgtga gcatgtacgg ttggccgccg 2040

cgcgcttacc cgacgtatga cgccattcca aaacaaaagt ttttggatca ggcagcgcgc 2100 

tatgatcgtt tgttcaaact gtatgaaaag ttgagcgata aaattagcaa cgtcaccttc 2160 

tggggcatcg ccgacaatca tacgtggctc gacagccgtg cggatgtgta ctatgacgcc 2220 

aacgggaatg ttgtggttga cccgaacgct ccgtacgcaa aagtggaaaa agggaaagga 2280 

aaagatgcgc cgttcgtttt tggaccggat tacaaagtca aacccgcata ttgggctatt 2340 

atcgaccaca aatag 2355 

<210> 324 
<211> 784 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 324 

Met Met Leu Asn Ala Arg Cys Ile Gln Leu Met Lys Leu Leu Leu Arg

1 5 10 15

Ser Ser Leu Tyr Leu Thr Ala Asp Lys Leu Ala Gln Ser Leu Asn Val

20 25 30

Ser Lys Arg Thr Ile Tyr Tyr Asp Ile Gln Lys Thr Asn Glu Trp Leu

35 40 45

His His Glu Gly Leu Lys Pro Ile Gln Tyr Ala Arg Gly Leu Gly Phe

50 55 60

Arg Leu Asp Asp Glu Val Lys Gln Glu Ile Thr Thr Lys Trp Asn Thr

65 70 75 80

Leu Gln Pro Ala Arg His Tyr Thr Tyr Gln Ser Trp Glu Arg Lys Ala

85 90 95

Trp Ile Gly Leu Trp Ile Leu Thr Arg Val His Pro Leu Tyr Leu Ser

100 105 110

Asp Phe Leu Glu Lys Leu His Val Ser Arg Ser Thr Leu Leu Asn Asp

115 120 125

Ile Lys Glu Leu Lys Glu Asp Trp Gln Ser Phe Gln Leu Arg Leu Ser

130 135 140

Phe His Arg Lys Lys Gly Tyr Phe Ser Ser Gly Glu Glu Ile Gln Lys

145 150 155 160

Arg Lys Leu Met Ile Arg Tyr Ile His Gln Ile Leu Ala Ala Met Asp

165 170 175

Asp Gln His Phe Ala Ala Glu Leu Ser Ala Glu Cys Gln Trp Pro Ile

180 185 190

Phe Asp Trp Ile Cys Gln Phe Glu Ser Thr Phe Ser Ile Arg Tyr Thr

195 200 205

Gly Glu Val Ile Gln Thr Leu Pro Ile Tyr Leu Ala Leu Phe Gln Arg

210 215 220

Arg Trp Ala Arg Gly Lys Phe Val Gln Met Asp Glu Gln Glu Lys Glu

225 230 235 240

Val Leu Arg Ser Met Arg Glu Tyr Gln Ile Ala Asp His Leu Val Arg

245 250 255

Arg Ile Glu Asn Val Ser Glu Ile Ser Ile Pro Asp Asp Glu Val Cys

260 265 270

Tyr Leu Thr Thr His Leu Leu Ser Phe Arg Val Ala Asp Asp Lys Gln

275 280 285

Ile Asp His Asn Asp Asp Ile Thr Thr Leu Lys Arg Ile Ile Arg His

290 295 300

Met Val Asp Asp Phe Gln Thr Tyr Ala Cys Val Gln Phe Lys Arg Arg

305 310 315 320

Glu Glu Leu Glu Lys Asn Leu Leu Val His Met Lys Pro Ala Tyr Tyr

325 330 335

Arg Leu Lys Tyr Gly Phe His Leu Gln Asn Asp Leu Thr Glu Ser Val

340 345 350

Lys Ala Asn Tyr Gln Asp Leu Phe Thr Leu Thr Lys Lys Val Val His

355 360 365

His Leu Glu Ser Val Val Gly Gln Pro Val Ser Asp Asp Glu Ile Ala

370 375 380

Tyr Ile Ala Met His Phe Gly Gly Trp Leu Asp Arg Glu Gly Val Ser

385 390 395 400

Val Pro Val Arg Lys Lys Val Leu Ile Val Cys Glu Ser Gly Ile Gly

405 410 415

Thr Ser Arg Met Leu Gln Lys Gln Leu Asp Gln Arg Tyr Lys Asn Glu

420 425 430

Phe Thr Ile Gly Ala Ala Val Glu Pro Tyr Gln Leu Gln Asn Glu Lys

435 440 445

Asp Val Gln Met Leu Lys Arg His Phe Asn Ser Ile Val Ala Glu Asn

450 455 460

Val Met Lys Pro Ile Ser Ile Gln Pro Glu Glu Gly Lys Phe Asn Phe

465 470 475 480

Glu Gln Ala Asp Arg Ile Val Lys Phe Ala Lys Ala Asn Gly Met Asp

485 490 495

Ile Arg Phe His Thr Leu Val Trp His Ser Gln Val Pro Gln Trp Phe

500 505 510

Phe Leu Asp Lys Glu Gly Lys Pro Met Val Asn Glu Thr Asp Pro Val

515 520 525

Lys Arg Glu Gln Asn Lys Gln Leu Leu Leu Lys Arg Leu Glu Thr His

530 535 540

Ile Lys Thr Ile Val Glu Arg Tyr Lys Asp Asp Ile Lys Tyr Trp Asp

545 550 555 560

Val Val Asn Glu Val Val Gly Asp Asp Gly Lys Leu Arg Asn Ser Pro

565 570 575

Trp Tyr Gln Ile Ala Gly Ile Asp Tyr Ile Lys Val Ala Phe Gln Thr

580 585 590

Ala Arg Lys Tyr Gly Gly Asn Lys Ile Lys Leu Tyr Ile Asn Asp Tyr

595 600 605

Asn Thr Glu Val Glu Pro Lys Arg Ser Ala Leu Tyr Asn Leu Val Lys

610 615 620

Gln Leu Lys Glu Glu Gly Val Pro Ile Asp Gly Ile Gly His Gln Ser
625 630 635 640

His Ile Gln Ile Gly Trp Pro Ser Glu Ala Glu Ile Glu Lys Thr Ile

645 650 655

Asn Met Phe Ala Ala Leu Gly Leu Asp Asn Gln Ile Thr Glu Leu Asp

660 665 670

Val Ser Met Tyr Gly Trp Pro Pro Arg Ala Tyr Pro Thr Tyr Asp Ala

675 680 685

Ile Pro Lys Gln Lys Phe Leu Asp Gln Ala Ala Arg Tyr Asp Arg Leu

690 695 700

Phe Lys Leu Tyr Glu Lys Leu Ser Asp Lys Ile Ser Asn Val Thr Phe
705 710 715 720

Trp Gly Ile Ala Asp Asn His Thr Trp Leu Asp Ser Arg Ala Asp Val

725 730 735

Tyr Tyr Asp Ala Asn Gly Asn Val Val Val Asp Pro Asn Ala Pro Tyr

740 745 750

Ala Lys Val Glu Lys Gly Lys Gly Lys Asp Ala Pro Phe Val Phe Gly

755 760 765

Pro Asp Tyr Lys Val Lys Pro Ala Tyr Trp Ala Ile Ile Asp His Lys
770 775 780 

<210> 325 
<211> 1146 
<212> DNA 
<213> Unknown

<220> 

<223> Obtained from an environmental sample. 
<400> 325 

atgactattt cccgccggaa atttatgtgg ggcacagctg cactcctggc caccacccag 60 

ctcaaaaccc gcgctctcgc cgctgccatg gccagcacag gcatcaagga cgccttcaag 120 

ggcgacttcc atatcggcac cgccatcagc aacgctaccc tgcaaaacca ggatgccacc 180 

atgctggatt tgatcaagcg cgaatttaat gcaattaccg ctgaaaattg catgaagtgg 240 

gagcctattc gcccacagct ggatcagtgg aattgggagc tggccgaccg ctttgtggat 300 

ttcggcgtta aaaacaagat gtatgtggta ggtcacacgc tgatttggca cagccaggcg 360 

ccagcgcaca tttatctcga cgccgatggt aagcccaaca gtcgcgatgc ccagttgaaa 420 

gtaatggagg agcacatacg taccctggcg ggccgctaca aaggaaagat agacgcctgg 480 

gacgtggtta acgaagcagt ggaggatgat ggcagctggc gtcaaaccgg ctggtacaaa 540 

aacatgggtg aagaatatat cgcccatgcc ttccgcttgg cagccgaggt agaccccaac 600 

gccaagctac tctacaacga ctacaacgag gctgtacccg ccaagcgtga tgcgattatt 660 

cgggtggtaa aaggcgtgca gaaggctggc gcacccattc acggtgtggg gatgcaaggg 720 

cacatgagcc tgtcacatcc ggatttcgcg gagttcgaaa aatccataat cgaatacgcc 780 

aagttggggg tgaaggtgca cgttaccgaa ctggatatcg acgtgttgcc actggcgtgg 840 

aacctgagtg cggaaatttc caatcgcttt gaataccgcc cagagatgga tccttatcgc 900 

gaaggtttgc ccgccaaagt cgaggaggag ctagcggctc gttacgaggc gctgtttaaa 960 

atcctgctgc gtcatcgcga caaaattgag cgtgtgacca cttggggcac caacgactca 1020 

gagacctggt taaatggctt ccccattccg gggcgcatga attacccaat gctgttcgat 1080 

cgtaataacc agcccaagtt ggcctatcac cggctgctgg cactcaaaca aaagaaaagt 1140 

cagtaa 1146 

<210> 326 
<211> 381 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample.

<221> SIGNAL
<222> (1)...(27)

<400> 326

Met Thr Ile Ser Arg Arg Lys Phe Met Trp Gly Thr Ala Ala Leu Leu

1 5 10 15

Ala Thr Thr Gln Leu Lys Thr Arg Ala Leu Ala Ala Ala Met Ala Ser

20 25 30

Thr Gly Ile Lys Asp Ala Phe Lys Gly Asp Phe His Ile Gly Thr Ala

35 40 45

Ile Ser Asn Ala Thr Leu Gln Asn Gln Asp Ala Thr Met Leu Asp Leu

50 55 60

Ile Lys Arg Glu Phe Asn Ala Ile Thr Ala Glu Asn Cys Met Lys Trp
65 70 75 80

Glu Pro Ile Arg Pro Gln Leu Asp Gln Trp Asn Trp Glu Leu Ala Asp

85 90 95

Arg Phe Val Asp Phe Gly Val Lys Asn Lys Met Tyr Val Val Gly His

100 105 110

Thr Leu Ile Trp His Ser Gln Ala Pro Ala His Ile Tyr Leu Asp Ala

115 120 125

Asp Gly Lys Pro Asn Ser Arg Asp Ala Gln Leu Lys Val Met Glu Glu

130 135 140

His Ile Arg Thr Leu Ala Gly Arg Tyr Lys Gly Lys Ile Asp Ala Trp
145 150 155 160

Asp Val Val Asn Glu Ala Val Glu Asp Asp Gly Ser Trp Arg Gln Thr

165 170 175

Gly Trp Tyr Lys Asn Met Gly Glu Glu Tyr Ile Ala His Ala Phe Arg

180 185 190

Leu Ala Ala Glu Val Asp Pro Asn Ala Lys Leu Leu Tyr Asn Asp Tyr

195 200 205

Asn Glu Ala Val Pro Ala Lys Arg Asp Ala Ile Ile Arg Val Val Lys

210 215 220

Gly Val Gln Lys Ala Gly Ala Pro Ile His Gly Val Gly Met Gln Gly
225 230 235 240

His Met Ser Leu Ser His Pro Asp Phe Ala Glu Phe Glu Lys Ser Ile

245 250 255

Ile Glu Tyr Ala Lys Leu Gly Val Lys Val His Val Thr Glu Leu Asp

260 265 270

Ile Asp Val Leu Pro Leu Ala Trp Asn Leu Ser Ala Glu Ile Ser Asn

275 280 285

Arg Phe Glu Tyr Arg Pro Glu Met Asp Pro Tyr Arg Glu Gly Leu Pro

290 295 300

Ala Lys Val Glu Glu Glu Leu Ala Ala Arg Tyr Glu Ala Leu Phe Lys
305 310 315 320

Ile Leu Leu Arg His Arg Asp Lys Ile Glu Arg Val Thr Thr Trp Gly

325 330 335

Thr Asn Asp Ser Glu Thr Trp Leu Asn Gly Phe Pro Ile Pro Gly Arg

340 345 350

Met Asn Tyr Pro Met Leu Phe Asp Arg Asn Asn Gln Pro Lys Leu Ala

355 360 365

Tyr His Arg Leu Leu Ala Leu Lys Gln Lys Lys Ser Gln
370 375 380 

<210> 327 
<211> 1500 
<212> DNA 
<213> Unknown

<220>

<223> Obtained from an environmental sample.
<400> 327 

atgaaacgtt cagtctctat ctttatcgca tgtttagtaa tgacagtatt aacaattagc 60 

ggtgtcgcgg caccagaagc atctgcagca ggggcgaaaa cgcctgtagc ccttaatggc 120 

cagcttagca ttaaaggtac tcagctagtc aatcaaaacg gaaaatcggt gcagctgaag 180 

gggatcagct cacacggttt gcagtggttc ggcgattatg tcaataaaga ctctttaaaa 240 

tggctaagag acgattgggg aattaccgtc ttccgagcgg caatgtacac ggctgaaggc 300 

ggttatatag agaatccgtc tgtgaaaaat aaagtcaaag aagctgttga agcggcaaaa 360 

gagctcggga tatatgtcat cattgactgg catattttaa atgacggcaa tccaaatcaa 420 

aataaagaga aggcgaagga attctttaag gaaatgtcga gcctttacgg aagcacacca 480 

aacgttattt atgaaattgc taatgaaccg aacggtgatg taaattggaa gcgcgatatc 540 

aaaccgtatg cggaggaagt gatttccgtt atccgtaaaa atgacccgga taacatcatt 600

attaccggaa ctggcacttg gagtcaggat gtcaatgatg ctgctgatga tcagcttaag 660 

gatgcaaacg tcatgtacgc gcttcatttt tatgcaggta cacacggcca gtatttaagg 720 

gataaagccg attatgcgct cagcaaagga gcgccgattt ttgtaacgga atgggggacg 780 

agtgacgctt ccggaaatgg cggggtcttc cttgaccagt cgagggaatg gctgaattat 840 

ctcgacaaca agaaaatcag ctgggtaaac tggaaccttt ctgataagca ggaatcttcc 900 

tcagctttaa agccgggggc atctaaaaca ggcggctggc cgttatcaga tttatccgct 960 

tcagggacat ttgtaaggga aaagatccgt ggctcccaac attcgactga agacagatct 1020 

gagacaccaa agcaagataa acccgtacag gaaaacagcc tatctgtgca atacagaaca 1080 

ggggatggaa gtgtgaacag caaccaaatc cgtcctcaga tccatgtgaa aaacaacagc 1140 

aagaccaccg ttaatttaaa aaatgtaact gtccgctact ggtataacac gaaaaacaaa 1200 

ggccaaaact tcgactgtga ctacgcgaag atcggatgca gcaatgtgac gcacaagttt 1260 

gtgacattac aaaaacctgt aaaaggtgca gatgcctatc tggaacttgg gtttaaaaac 1320 

gggacactgt caccgggagc aaacactgga gaaatccaaa ttcgtcttca caatgaggat 1380 

tggggcaatt attcacaaat cggggattat tctttttctc agtcaaatac gtttaaagat 1440 

acaaaaaaaa tcacattata taataacgga aaactaattt ggggaactga acccaaatag 1500 

<210> 328 
<211> 499 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(29)

<400> 328 

Met Lys Arg Ser Val Ser Ile Phe Ile Ala Cys Leu Val Met Thr Val

1 5 10 15

Leu Thr Ile Ser Gly Val Ala Ala Pro Glu Ala Ser Ala Ala Gly Ala

20 25 30

Lys Thr Pro Val Ala Leu Asn Gly Gln Leu Ser Ile Lys Gly Thr Gln

35 40 45

Leu Val Asn Gln Asn Gly Lys Ser Val Gln Leu Lys Gly Ile Ser Ser

50 55 60

His Gly Leu Gln Trp Phe Gly Asp Tyr Val Asn Lys Asp Ser Leu Lys
65 70 75 80

Trp Leu Arg Asp Asp Trp Gly Ile Thr Val Phe Arg Ala Ala Met Tyr

85 90 95

Thr Ala Glu Gly Gly Tyr Ile Glu Asn Pro Ser Val Lys Asn Lys Val

100 105 110

Lys Glu Ala Val Glu Ala Ala Lys Glu Leu Gly Ile Tyr Val Ile Ile

115 120 125

Asp Trp His Ile Leu Asn Asp Gly Asn Pro Asn Gln Asn Lys Glu Lys

130 135 140

Ala Lys Glu Phe Phe Lys Glu Met Ser Ser Leu Tyr Gly Ser Thr Pro
145 150 155 160

Asn Val Ile Tyr Glu Ile Ala Asn Glu Pro Asn Gly Asp Val Asn Trp

165 170 175

Lys Arg Asp Ile Lys Pro Tyr Ala Glu Glu Val Ile Ser Val Ile Arg

180 185 190

Lys Asn Asp Pro Asp Asn Ile Ile Ile Thr Gly Thr Gly Thr Trp Ser

195 200 205

Gln Asp Val Asn Asp Ala Ala Asp Asp Gln Leu Lys Asp Ala Asn Val

210 215 220

Met Tyr Ala Leu His Phe Tyr Ala Gly Thr His Gly Gln Tyr Leu Arg
225 230 235 240

Asp Lys Ala Asp Tyr Ala Leu Ser Lys Gly Ala Pro Ile Phe Val Thr

245 250 255

Glu Trp Gly Thr Ser Asp Ala Ser Gly Asn Gly Gly Val Phe Leu Asp

260 265 270

Gln Ser Arg Glu Trp Leu Asn Tyr Leu Asp Asn Lys Lys Ile Ser Trp

275 280 285

Val Asn Trp Asn Leu Ser Asp Lys Gln Glu Ser Ser Ser Ala Leu Lys

290 295 300

Pro Gly Ala Ser Lys Thr Gly Gly Trp Pro Leu Ser Asp Leu Ser Ala
305 310 315 320

Ser Gly Thr Phe Val Arg Glu Lys Ile Arg Gly Ser Gln His Ser Thr

325 330 335

Glu Asp Arg Ser Glu Thr Pro Lys Gln Asp Lys Pro Val Gln Glu Asn

340 345 350

Ser Leu Ser Val Gln Tyr Arg Thr Gly Asp Gly Ser Val Asn Ser Asn

355 360 365

Gln Ile Arg Pro Gln Ile His Val Lys Asn Asn Ser Lys Thr Thr Val

370 375 380

Asn Leu Lys Asn Val Thr Val Arg Tyr Trp Tyr Asn Thr Lys Asn Lys
385 390 395 400

Gly Gln Asn Phe Asp Cys Asp Tyr Ala Lys Ile Gly Cys Ser Asn Val

405 410 415

Thr His Lys Phe Val Thr Leu Gln Lys Pro Val Lys Gly Ala Asp Ala

420 425 430

Tyr Leu Glu Leu Gly Phe Lys Asn Gly Thr Leu Ser Pro Gly Ala Asn

435 440 445

Thr Gly Glu Ile Gln Ile Arg Leu His Asn Glu Asp Trp Gly Asn Tyr

450 455 460

Ser Gln Ile Gly Asp Tyr Ser Phe Ser Gln Ser Asn Thr Phe Lys Asp
465 470 475 480

Thr Lys Lys Ile Thr Leu Tyr Asn Asn Gly Lys Leu Ile Trp Gly Thr
485 490 495 

Glu Pro Lys 

<210> 329 
<211> 2268 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 329 

atgaggaacg ttcaggaaat aggaggcagt atgtacaaaa aggcttttct tgtactggca 60 

ttgtttttgc tgttggcggc ggtggcgctc ccgtctgtgg gggctgcgcc gcaggggccg 120 

cgcctgcgcg atgtggcggg cgacatttta gtgggttacg cctccagaaa cgatttctgg 180 

aacatgtctg actcagccca atacacagaa gttgcccgca ctgagttcaa cttcatgacg 240 

cccgaaaacg ccatgaagtg ggacgccatt catcccgcgc aaaactcata cagttttgcc 300 

caggccgacc ggcacgtgca gtttgcccag gccaacaaca tggccgtgca tggacatgcc 360 

ctcgtgtggc acagccaaaa tccaggctgg ctgaccaatg gcaactggtc ccgcagccaa 420 

ttgatcaaca tcatgaacga ccacattgac acggtcgccg gccgttatgc aggtgaggtg 480 

ctggtgtggg acgtggtcaa tcaggcgttt aatgaggatg gaacttatcg cagcaccatc 540 

tggtacaacg ggatcggaca ggaatatatc gacctggcct ttacccgcgc ccgcgccgcc 600 

gatcctcatg ccaaactcat ttacaacgat tacaacattg gctggttaaa cagtaagtcg 660 

aatggcgtct acaacatggc cgccgatatg gtcaggcgcg gtgtgcccat cgacggcgtt 720 

ggtttccaga tgcacctgga acggggcggc gtcagcggca gcagtctggc gagcaacatg 780 

cagcgtttcg ccgatttggg attggaagtt tacatcaccg aattggacgt gcgcattccc 840 

caaaacccaa cccagcagga tttgcaggct caggcggcag tttaccaaac ggtgacgaat 900 

cgctgtttgg cgcagcctgc ctgcaaggcg ttgcaggtct ggggcatccc cgacaaatat 960 

tcctgggtac cggacgtatt ccccggcacg ggcgcgcctc tgttgtttaa cgacaactat 1020 

gaggccaaac ccgcctatta tgccgtccag gcagagttga tggccgcgaa tccgcagccc 1080 

acaaacacac cgggaacgcc cgctcatacc ccttcggcca cgtctacgtc tgcggccact 1140 

gctacgcccc cggcaacggc cacggcgacc gccaccaccc cctccggcgg cggcgtttgc 1200 

gccgttgatt acgtcattgc caaccagtgg ggcaatggct ttcaggccaa cgtcaccatc 1260 

accaatcaca gcgccgcgcc ggtgaacggc tataccctgg cctggaccca cgcgccgggg 1320 

cagattgtca ccagcggctg gaacgtaacc atcgcccaaa gcggcagcgc cgtcagcgcc 1380 

agcaacccgg ccggttattg gaacggtgtg atcggagcca acggcggcaa gatttctttt 1440 

ggtttccagg gatctctggc gggcggcagc gcggtcgcgc ccacttattt tgccttgaac 1500 

ggcgctgcct gtaacggggc cgtccttccg cctactgcca ccttcacgcc ttcaccgacg 1560 

gctaccatgt gtccccaggc aacgcctgaa ctgcttgtcg tgcagccggt gacttcaccc 1620 

actacccaac tgtctcaaac gctggtggtg cgtttaggca acggcgaatg ggtgcgcgct 1680 

gccggaccgg caggcgttgt caccgtcact gcgccggacc cggatggtta tttccgcctg 1740 

acgataccgc tggcagccaa taccagcaac gccattctgg tagaagggcg ggtgcgggtt 1800 

atcacccatt caaatggctg cacctatggc ggttatacct tgagcagaac cgtaacgatt 1860 

gtgcaagcca gcagcccagt caccttaacg ccgactgcca caccttcccc caccgccacg 1920 

gcaacgccta cggtaaccgc cacgtcgccg tcaggcgcct gcaccgtcgc ctacgccatc 1980 

accaacgact ggggcagcgg tttcaccgcc aacgttaccc tcaccaatac tggcggaagc 2040 

gccctcaacg gctggaccct ggcctatgcc tttcccggca atcaaaccat cagcaacgcc 2100 

tggaacggaa cggccgttca gtccggcagc agcgtcagcg tcaccaacgc cggttggaat 2160
ggcagcctgc cgcccaacgt ctccgccagc tttggcttcc aggcgagcta cagcggcaat 2220 
aacagcgtcc ctgccagctt tacgctgaac ggcgcgcttt gccattga 2268 

<210> 330 
<211> 755 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample.

<221> SIGNAL
<222> (1)...(35)

<400> 330 

Met Arg Asn Val Gln Glu Ile Gly Gly Ser Met Tyr Lys Lys Ala Phe
1 5 10 15

Leu Val Leu Ala Leu Phe Leu Leu Leu Ala Ala Val Ala Leu Pro Ser

20 25 30

Val Gly Ala Ala Pro Gln Gly Pro Arg Leu Arg Asp Val Ala Gly Asp

35 40 45

Ile Leu Val Gly Tyr Ala Ser Arg Asn Asp Phe Trp Asn Met Ser Asp

50 55 60

Ser Ala Gln Tyr Thr Glu Val Ala Arg Thr Glu Phe Asn Phe Met Thr
65 70 75 80

Pro Glu Asn Ala Met Lys Trp Asp Ala Ile His Pro Ala Gln Asn Ser

85 90 95

Tyr Ser Phe Ala Gln Ala Asp Arg His Val Gln Phe Ala Gln Ala Asn

100 105 110

Asn Met Ala Val His Gly His Ala Leu Val Trp His Ser Gln Asn Pro

115 120 125

Gly Trp Leu Thr Asn Gly Asn Trp Ser Arg Ser Gln Leu Ile Asn Ile

130 135 140

Met Asn Asp His Ile Asp Thr Val Ala Gly Arg Tyr Ala Gly Glu Val
145 150 155 160

Leu Val Trp Asp Val Val Asn Gln Ala Phe Asn Glu Asp Gly Thr Tyr

165 170 175

Arg Ser Thr Ile Trp Tyr Asn Gly Ile Gly Gln Glu Tyr Ile Asp Leu

180 185 190

Ala Phe Thr Arg Ala Arg Ala Ala Asp Pro His Ala Lys Leu Ile Tyr

195 200 205

Asn Asp Tyr Asn Ile Gly Trp Leu Asn Ser Lys Ser Asn Gly Val Tyr

210 215 220

Asn Met Ala Ala Asp Met Val Arg Arg Gly Val Pro Ile Asp Gly Val
225 230 235 240

Gly Phe Gln Met His Leu Glu Arg Gly Gly Val Ser Gly Ser Ser Leu

245 250 255

Ala Ser Asn Met Gln Arg Phe Ala Asp Leu Gly Leu Glu Val Tyr Ile

260 265 270

Thr Glu Leu Asp Val Arg Ile Pro Gln Asn Pro Thr Gln Gln Asp Leu

275 280 285

Gln Ala Gln Ala Ala Val Tyr Gln Thr Val Thr Asn Arg Cys Leu Ala

290 295 300

Gln Pro Ala Cys Lys Ala Leu Gln Val Trp Gly Ile Pro Asp Lys Tyr
305 310 315 320

Ser Trp Val Pro Asp Val Phe Pro Gly Thr Gly Ala Pro Leu Leu Phe

325 330 335

Asn Asp Asn Tyr Glu Ala Lys Pro Ala Tyr Tyr Ala Val Gln Ala Glu

340 345 350

Leu Met Ala Ala Asn Pro Gln Pro Thr Asn Thr Pro Gly Thr Pro Ala

355 360 365

His Thr Pro Ser Ala Thr Ser Thr Ser Ala Ala Thr Ala Thr Pro Pro

370 375 380

Ala Thr Ala Thr Ala Thr Ala Thr Thr Pro Ser Gly Gly Gly Val Cys
385 390 395 400

Ala Val Asp Tyr Val Ile Ala Asn Gln Trp Gly Asn Gly Phe Gln Ala

405 410 415

Asn Val Thr Ile Thr Asn His Ser Ala Ala Pro Val Asn Gly Tyr Thr

420 425 430

Leu Ala Trp Thr His Ala Pro Gly Gln Ile Val Thr Ser Gly Trp Asn

435 440 445

Val Thr Ile Ala Gln Ser Gly Ser Ala Val Ser Ala Ser Asn Pro Ala

450 455 460

Gly Tyr Trp Asn Gly Val Ile Gly Ala Asn Gly Gly Lys Ile Ser Phe
465 470 475 480

Gly Phe Gln Gly Ser Leu Ala Gly Gly Ser Ala Val Ala Pro Thr Tyr

485 490 495

Phe Ala Leu Asn Gly Ala Ala Cys Asn Gly Ala Val Leu Pro Pro Thr

500 505 510

Ala Thr Phe Thr Pro Ser Pro Thr Ala Thr Met Cys Pro Gln Ala Thr

515 520 525

Pro Glu Leu Leu Val Val Gln Pro Val Thr Ser Pro Thr Thr Gln Leu

530 535 540

Ser Gln Thr Leu Val Val Arg Leu Gly Asn Gly Glu Trp Val Arg Ala
545 550 555 560

Ala Gly Pro Ala Gly Val Val Thr Val Thr Ala Pro Asp Pro Asp Gly

565 570 575

Tyr Phe Arg Leu Thr Ile Pro Leu Ala Ala Asn Thr Ser Asn Ala Ile

580 585 590

Leu Val Glu Gly Arg Val Arg Val Ile Thr His Ser Asn Gly Cys Thr

595 600 605

Tyr Gly Gly Tyr Thr Leu Ser Arg Thr Val Thr Ile Val Gln Ala Ser

610 615 620

Ser Pro Val Thr Leu Thr Pro Thr Ala Thr Pro Ser Pro Thr Ala Thr
625 630 635 640

Ala Thr Pro Thr Val Thr Ala Thr Ser Pro Ser Gly Ala Cys Thr Val

645 650 655

Ala Tyr Ala Ile Thr Asn Asp Trp Gly Ser Gly Phe Thr Ala Asn Val

660 665 670

Thr Leu Thr Asn Thr Gly Gly Ser Ala Leu Asn Gly Trp Thr Leu Ala

675 680 685

Tyr Ala Phe Pro Gly Asn Gln Thr Ile Ser Asn Ala Trp Asn Gly Thr

690 695 700

Ala Val Gln Ser Gly Ser Ser Val Ser Val Thr Asn Ala Gly Trp Asn
705 710 715 720

Gly Ser Leu Pro Pro Asn Val Ser Ala Ser Phe Gly Phe Gln Ala Ser

725 730 735

Tyr Ser Gly Asn Asn Ser Val Pro Ala Ser Phe Thr Leu Asn Gly Ala
740 745 750

Leu Cys His
755 

<210> 331 
<211> 1242 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 331 

gtgttcaagg gcttgcgcta tttgctgttg ctgtgcctga gtgcgggact ggtctttgcc 60 

tgtgcgccac ggtctgtgac cgccccaccc gatgggctaa gcgggcaaat taggctcctg 120 

cgccaaggaa ccctcactgt ccttgtccag aatgcccaag ggcaacccat tgccaacgcc 180 

aaggtggtag ctgctcagca aacccatgcc ttcccctttg gtgttgcctt agatacagca 240 

atgtttgagc cttccccgcc acccgcagcc aactggtacc gcaacaccgc tcgccaaaat 300 

tttaatgccg ctgtccatga aaacgccctc aagtggtatg cccttgaacc ggagcagggc 360 

aagctggact ttacgatggc ggatcgcatc ctcgcttgga gtgaagccca aggctggccg 420 

atgcgggggc acaccctctt ttgggaagtt gagcaattta accccccatg gctgaaaacg 480 

ctgccaccag agcaactgcg ggctgccgtc aagaaccatg ccatgacggt gtgtcgccat 540 

taccgcgggc gaatcaatga atttgatgtc aataatgaaa tgctccacgg taactttttc 600 

cgcagtcgtt tgggaaacgg catagttaaa gagatgttcg agtggtgccg cgagggtaac 660 

cccgaggccg tcctttatgt gaacgactac ggcattattg agggcgatcg cctcgacgac 720 

tacgtgcagc agattcgcga tttactgggg caaggggttc ccattggtgg cattggcatt 780 

caagcccatt tggaatatcc cttggatgca gccaagatga aacgcgccct tgataccctt 840 

gcccaattca acctgcccct aaaaatcact gaagttagtg tcagccttgc cgacgagcag 900 

cagcaggcgg agacactgcg ccaaatctac cgcattggtt ttgcccatcc agccgtcaaa 960 

gagatcctcc tgtggggatt ttgggaaggc aaccactggc gaccccaagc aggactgtac 1020

cgtcgcgact tttccgccaa acctgctgcc gaagcctatc gacaactcct ctttcaggag 1080 

tggtggacca ccagcaacgg caaaactaat gccgatgggc gctggcagac ccgcggctat 1140 

gcggggcgct atcgcctcac agtaacggcc aacggccaga ccattaaccg cgacattgac 1200 

ctaccagact tggagagaac cgtgaccgta caattcccat ga 1242 

<210> 332 
<211> 413 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(28) 

<400> 332 

Met Phe Lys Gly Leu Arg Tyr Leu Leu Leu Leu Cys Leu Ser Ala Gly 

1 5 10 15 

Leu Val Phe Ala Cys Ala Pro Arg Ser Val Thr Ala Pro Pro Asp Gly

20 25 30

Leu Ser Gly Gln Ile Arg Leu Leu Arg Gln Gly Thr Leu Thr Val Leu

35 40 45

Val Gln Asn Ala Gln Gly Gln Pro Ile Ala Asn Ala Lys Val Val Ala

50 55 60

Ala Gln Gln Thr His Ala Phe Pro Phe Gly Val Ala Leu Asp Thr Ala
65 70 75 80

Met Phe Glu Pro Ser Pro Pro Pro Ala Ala Asn Trp Tyr Arg Asn Thr

85 90 95

Ala Arg Gln Asn Phe Asn Ala Ala Val His Glu Asn Ala Leu Lys Trp

100 105 110

Tyr Ala Leu Glu Pro Glu Gln Gly Lys Leu Asp Phe Thr Met Ala Asp

115 120 125

Arg Ile Leu Ala Trp Ser Glu Ala Gln Gly Trp Pro Met Arg Gly His

130 135 140

Thr Leu Phe Trp Glu Val Glu Gln Phe Asn Pro Pro Trp Leu Lys Thr
145 150 155 160

Leu Pro Pro Glu Gln Leu Arg Ala Ala Val Lys Asn His Ala Met Thr

165 170 175

Val Cys Arg His Tyr Arg Gly Arg Ile Asn Glu Phe Asp Val Asn Asn

180 185 190

Glu Met Leu His Gly Asn Phe Phe Arg Ser Arg Leu Gly Asn Gly Ile

195 200 205

Val Lys Glu Met Phe Glu Trp Cys Arg Glu Gly Asn Pro Glu Ala Val

210 215 220

Leu Tyr Val Asn Asp Tyr Gly Ile Ile Glu Gly Asp Arg Leu Asp Asp
225 230 235 240

Tyr Val Gln Gln Ile Arg Asp Leu Leu Gly Gln Gly Val Pro Ile Gly

245 250 255

Gly Ile Gly Ile Gln Ala His Leu Glu Tyr Pro Leu Asp Ala Ala Lys

260 265 270

Met Lys Arg Ala Leu Asp Thr Leu Ala Gln Phe Asn Leu Pro Leu Lys

275 280 285

Ile Thr Glu Val Ser Val Ser Leu Ala Asp Glu Gln Gln Gln Ala Glu

290 295 300

Thr Leu Arg Gln Ile Tyr Arg Ile Gly Phe Ala His Pro Ala Val Lys
305 310 315 320

Glu Ile Leu Leu Trp Gly Phe Trp Glu Gly Asn His Trp Arg Pro Gln

325 330 335

Ala Gly Leu Tyr Arg Arg Asp Phe Ser Ala Lys Pro Ala Ala Glu Ala

340 345 350

Tyr Arg Gln Leu Leu Phe Gln Glu Trp Trp Thr Thr Ser Asn Gly Lys

355 360 365

Thr Asn Ala Asp Gly Arg Trp Gln Thr Arg Gly Tyr Ala Gly Arg Tyr

370 375 380

Arg Leu Thr Val Thr Ala Asn Gly Gln Thr Ile Asn Arg Asp Ile Asp
385 390 395 400

Leu Pro Asp Leu Glu Arg Thr Val Thr Val Gln Phe Pro
405 410

<210> 333 
<211> 1152 
<212> DNA 
<213> Unknown

<220>

<223> Obtained from an environmental sample.
<400> 333 

atgaaaagac aatttattgg acgattgaga cttgtcacta tcctttcaat catagtgatt 60 

atgggatgtg cttcaaacaa aagtgatcag aatgttgata acctaaagga cgccttcgac 120 

ggtttgttcc ttattggaac tgccatgaat accccccaga tcaccggaca ggatacccgg 180 

acgcttgaat tgatcaaaaa acacatgaac tccattgtgg cagaaaacgt tatgaaaagc 240 

ggactaatac agcccagcga aggggagttc gacttctcac ttgccgacca gtttgtgcaa 300 

ttcggtgttg acaacaacat gcacatcgta gggcataccc ttatctggca ttcgcaggct 360 

ccagggtggt tttttgtgga tgaaaacggt aatgatgtta gtcccgaagt tcttaagcaa 420 

aggatgaaag accacatcta cacagtagtt ggccgttaca aaggcaaagt gcacggttgg 480 

gatgtggtga atgaatgtat cgttgacgat gggtcatggc gcaacagcaa gttttaccag 540 

atcctgggtg aagactttgt aaagtatgcc ttccagtttg cttcagaagc cgacccgaat 600 

gctgaattgt attacaacga ttattccatg gcacttcccg gccgccgcca gggagtcgta 660 

aacatggtaa aaaatctaca ggcacaaggt attaaaattg acggaatagg aatgcagggc 720 

cacctgatga tcgaccatcc atcccttgaa gatttcgaaa ccagtttgct tgcctttgcc 780 

gatctgggtg tacatgttat gatcactgag cttgatgtat ctgtacttcc ttttcctacc 840 

cgcaacctcg gtgctgatgt atctctaaac atagcttaca acactgaact gaacccctat 900 

cccgatggat tgcctgatga tgtggcccaa aaacttcatg atcgctggct cgatatatat 960 

cgtttattta taaaacatca cgacaagatc acccgtgtta ctacctgggg tacagccgat 1020 

ggtatgtcat ggaagaacaa ctggcccatt cgtggacgca cagactttcc tttattattc 1080 

gaccgcgatt ttcaacccaa accggtagta gctgatatta tcaaagaagc attggctgca 1140 

aagagaaaat ag 1152 

<210> 334 

<211> 383 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample.

<221> SIGNAL 
<222> (1)...(30) 

<400> 334 

Met Lys Arg Gln Phe Ile Gly Arg Leu Arg Leu Val Thr Ile Leu Ser

1 5 10 15

Ile Ile Val Ile Met Gly Cys Ala Ser Asn Lys Ser Asp Gln Asn Val

20 25 30

Asp Asn Leu Lys Asp Ala Phe Asp Gly Leu Phe Leu Ile Gly Thr Ala

35 40 45

Met Asn Thr Pro Gln Ile Thr Gly Gln Asp Thr Arg Thr Leu Glu Leu

50 55 60

Ile Lys Lys His Met Asn Ser Ile Val Ala Glu Asn Val Met Lys Ser
65 70 75 80

Gly Leu Ile Gln Pro Ser Glu Gly Glu Phe Asp Phe Ser Leu Ala Asp

85 90 95

Gln Phe Val Gln Phe Gly Val Asp Asn Asn Met His Ile Val Gly His

100 105 110

Thr Leu Ile Trp His Ser Gln Ala Pro Gly Trp Phe Phe Val Asp Glu

115 120 125

Asn Gly Asn Asp Val Ser Pro Glu Val Leu Lys Gln Arg Met Lys Asp

130 135 140

His Ile Tyr Thr Val Val Gly Arg Tyr Lys Gly Lys Val His Gly Trp
145 150 155 160

Asp Val Val Asn Glu Cys Ile Val Asp Asp Gly Ser Trp Arg Asn Ser

165 170 175

Lys Phe Tyr Gln Ile Leu Gly Glu Asp Phe Val Lys Tyr Ala Phe Gln

180 185 190

Phe Ala Ser Glu Ala Asp Pro Asn Ala Glu Leu Tyr Tyr Asn Asp Tyr

195 200 205

Ser Met Ala Leu Pro Gly Arg Arg Gln Gly Val Val Asn Met Val Lys

210 215 220

Asn Leu Gln Ala Gln Gly Ile Lys Ile Asp Gly Ile Gly Met Gln Gly
225 230 235 240

His Leu Met Ile Asp His Pro Ser Leu Glu Asp Phe Glu Thr Ser Leu

245 250 255

Leu Ala Phe Ala Asp Leu Gly Val His Val Met Ile Thr Glu Leu Asp

260 265 270

Val Ser Val Leu Pro Phe Pro Thr Arg Asn Leu Gly Ala Asp Val Ser

275 280 285

Leu Asn Ile Ala Tyr Asn Thr Glu Leu Asn Pro Tyr Pro Asp Gly Leu

290 295 300

Pro Asp Asp Val Ala Gln Lys Leu His Asp Arg Trp Leu Asp Ile Tyr
305 310 315 320

Arg Leu Phe Ile Lys His His Asp Lys Ile Thr Arg Val Thr Thr Trp

325 330 335

Gly Thr Ala Asp Gly Met Ser Trp Lys Asn Asn Trp Pro Ile Arg Gly

340 345 350

Arg Thr Asp Phe Pro Leu Leu Phe Asp Arg Asp Phe Gln Pro Lys Pro

355 360 365

Val Val Ala Asp Ile Ile Lys Glu Ala Leu Ala Ala Lys Arg Lys
370 375 380 

<210> 335 
<211> 849 
<212> DNA 
<213> Unknown

<220> 

<223> Obtained from an environmental sample. 
<400> 335 

atgattccaa ggatcgtcct ggccgtccgc atatccccta cttttctcag cccacaaaaa 60 

ggggtaataa aaatgataaa gcgggctttt atgataaccc tggcggcctt cctcctcctt 120 

ttcgccctaa attccctgcc tatccatgcc ggggccgaag gcggggagga aaagtttacc 180 

cccaaggtca tcgtggagca cggtttcgag aataacgact tccacggttg ggtcccccgg 240 

ggcggggtcg ggaccatttc cattaccaat gaggcggccc atagcgggtc ctcctgcctg 300 

aagatcaccg gccggactca agcttggcat atgccgcggg tggagatcac caagtactta 360 

gaaaagggag ctaagtataa gatcgaattg tacgtcaagc tccccgcggg cacctcgccg 420 

cgcaagttcc agctggcggt tctcacccgt tatctcgaag gcaaccagac cagggacaaa 480 

gaggactcca tctcggacga ggtggaggtg accgccgata cctggaccaa ggtcgagggc 540 

gagtacgtct tcgacccggc ggccatcggc gcctacgtct acccctacct caagggcgac 600 

cccgcagggg cctatgcccc ctatctcatc gatgatttca agatcaccac gatcgccccc 660 

gcccccaaga agaccgccgc taccgccgcg gcaaaagagg cagaagagcc cttaatcgag 720 

accgatatac catccttaaa agacgtctgc gcgtcctact tcgagatcgg cgcggccatc 780 

gagccatatg agttattctc caagccccac gatcagctgc tccggaaaca tttcaacacc 840 

gttggttga 849 

<210> 336 
<211> 282 
<212> PRT 
<213> Unknown

<220>

<223> Obtained from an environmental sample.

<221> SIGNAL 
<222> (1)...(50) 

<400> 336 

Met Ile Pro Arg Ile Val Leu Ala Val Arg Ile Ser Pro Thr Phe Leu

1 5 10 15

Ser Pro Gln Lys Gly Val Ile Lys Met Ile Lys Arg Ala Phe Met Ile

20 25 30

Thr Leu Ala Ala Phe Leu Leu Leu Phe Ala Leu Asn Ser Leu Pro Ile

35 40 45

His Ala Gly Ala Glu Gly Gly Glu Glu Lys Phe Thr Pro Lys Val Ile
50 55 60

Val Glu His Gly Phe Glu Asn Asn Asp Phe His Gly Trp Val Pro Arg
65 70 75 80

Gly Gly Val Gly Thr Ile Ser Ile Thr Asn Glu Ala Ala His Ser Gly

85 90 95

Ser Ser Cys Leu Lys Ile Thr Gly Arg Thr Gln Ala Trp His Met Pro

100 105 110

Arg Val Glu Ile Thr Lys Tyr Leu Glu Lys Gly Ala Lys Tyr Lys Ile

115 120 125

Glu Leu Tyr Val Lys Leu Pro Ala Gly Thr Ser Pro Arg Lys Phe Gln

130 135 140

Leu Ala Val Leu Thr Arg Tyr Leu Glu Gly Asn Gln Thr Arg Asp Lys
145 150 155 160

Glu Asp Ser Ile Ser Asp Glu Val Glu Val Thr Ala Asp Thr Trp Thr

165 170 175

Lys Val Glu Gly Glu Tyr Val Phe Asp Pro Ala Ala Ile Gly Ala Tyr

180 185 190

Val Tyr Pro Tyr Leu Lys Gly Asp Pro Ala Gly Ala Tyr Ala Pro Tyr

195 200 205

Leu Ile Asp Asp Phe Lys Ile Thr Thr Ile Ala Pro Ala Pro Lys Lys

210 215 220

Thr Ala Ala Thr Ala Ala Ala Lys Glu Ala Glu Glu Pro Leu Ile Glu
225 230 235 240

Thr Asp Ile Pro Ser Leu Lys Asp Val Cys Ala Ser Tyr Phe Glu Ile

245 250 255

Gly Ala Ala Ile Glu Pro Tyr Glu Leu Phe Ser Lys Pro His Asp Gln

260 265 270

Leu Leu Arg Lys His Phe Asn Thr Val Gly
275 280 

<210> 337 
<211> 870 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 337 

atgaagcccg acagcgtgct ggatgtaaac gccagcaaaa agctctccgc ccaggatgaa 60 

accgccgtgg cggtgaaatt cgacgccgcc cgcgccctgc tggattttgt caaggaaaac 120 

gggctcaagg tgcacggtca cgtgctggta tggcattccc agacgccgga agccttcttc 180 

cacgagggct atgatgccgc caggccctac gtggggcggg acgtgatgct ggggcgcatg 240 

aaaaactaca tcaaggccgt gtttgaatac actgagacca attaccccgg cgtcatcgtc 300 

tcctgggacg tagtgaacga agccatcgac gacggcacca acaagctgcg ccagtccaac 360 

tggttcaaaa ccgttggcga ggatttcgtg ctccgcgcct ttgaatacgc caggaaatac 420 

gcccccgaag gcacgctgct ttattacaac gattacaaca ccgccatgcc cggcaagctg 480 

aacggcatcg ccaatctgct caaagccctc atcgccgagg gcaacatcga cggctacggc 540 

ttccaaatgc accacagcgt gggcttcccc tccatggaaa tgatttccgc gtctgtggag 600 

cgcatcgccg gcatgggcct taagctccgg gtcagcgaat tggacgtggg caccgacgga 660 

aacaccgaaa gcagcttcac caagcaggcg gaaaaatacg ccgccatcat gcggctgctg 720 

ctggattata aggatcaaat ggaagccgtg caggtatggg gcctcaccga cgatatgagc 780 

tggcgccggg ccaactatcc cctgctcttc gacggcaaat tcaaccccaa gcccgccttc 840 

tacgccgtgg ctgacccata cgcaaaataa 870 

<210> 338 
<211> 289 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample.
<400> 338

Met Lys Pro Asp Ser Val Leu Asp Val Asn Ala Ser Lys Lys Leu Ser
1 5 10 15

Ala Gln Asp Glu Thr Ala Val Ala Val Lys Phe Asp Ala Ala Arg Ala
20 25 30

Leu Leu Asp Phe Val Lys Glu Asn Gly Leu Lys Val His Gly His Val
35 40 45

Leu Val Trp His Ser Gln Thr Pro Glu Ala Phe Phe His Glu Gly Tyr
50 55 60

Asp Ala Ala Arg Pro Tyr Val Gly Arg Asp Val Met Leu Gly Arg Met
65 70 75 80

Lys Asn Tyr Ile Lys Ala Val Phe Glu Tyr Thr Glu Thr Asn Tyr Pro
85 90 95

Gly Val Ile Val Ser Trp Asp Val Val Asn Glu Ala Ile Asp Asp Gly
100 105 110

Thr Asn Lys Leu Arg Gln Ser Asn Trp Phe Lys Thr Val Gly Glu Asp
115 120 125

Phe Val Leu Arg Ala Phe Glu Tyr Ala Arg Lys Tyr Ala Pro Glu Gly
130 135 140

Thr Leu Leu Tyr Tyr Asn Asp Tyr Asn Thr Ala Met Pro Gly Lys Leu
145 150 155 160

Asn Gly Ile Ala Asn Leu Leu Lys Ala Leu Ile Ala Glu Gly Asn Ile
165 170 175

Asp Gly Tyr Gly Phe Gln Met His His Ser Val Gly Phe Pro Ser Met
180 185 190

Glu Met Ile Ser Ala Ser Val Glu Arg Ile Ala Gly Met Gly Leu Lys
195 200 205

Leu Arg Val Ser Glu Leu Asp Val Gly Thr Asp Gly Asn Thr Glu Ser
210 215 220

Ser Phe Thr Lys Gln Ala Glu Lys Tyr Ala Ala Ile Met Arg Leu Leu
225 230 235 240

Leu Asp Tyr Lys Asp Gln Met Glu Ala Val Gln Val Trp Gly Leu Thr
245 250 255

Asp Asp Met Ser Trp Arg Arg Ala Asn Tyr Pro Leu Leu Phe Asp Gly
260 265 270

Lys Phe Asn Pro Lys Pro Ala Phe Tyr Ala Val Ala Asp Pro Tyr Ala
275 280 285

Lys

<210> 339 
<211> 1125 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample.
<400> 339 

atgcctatgg agcgacccac tttcttgcgg tttcttgcct tttttcttct ttttaccatg 60 

attttcgccg ccggagggtg ccgacccctt gccccttcac ggatggagat cgagacggat 120 

atcccctccc tcaaggaagt cgccgcttct tatttcgaga tcggcgcggc cgtcgagccg 180 

tatcagttat cctctccacc ccacgatgcc cttctgcgga aacattttaa ctgcctcgtg 240 

gcggagaacg tcatgaagcc cgcctccatc cagccatcgg aggggtattt caactggacc 300 

gaagcggaca agatcgtgaa ctacgccaaa gcccacggga tgaagctccg cttccatacc 360 

ctcgtctggc ataatcaggt cccggattgg ttcttcgcgg gtaacgacaa aacccgcctt 420 

ttgcagcgct tggagaatca tatccggact atcattaaaa gatatggcga taaggtcgac 480 

tattgggacg tggtgaacga agtaatagac gacaacggcg gtatgcgaaa cagcaagtgg 540 

taccagatca ccgggaagga ctacatcaag accgccttcc gggtggcaga cgacgagctc 600 

aggaagaatg ggtggaggaa agaaggtcgt cagctctata tcaacgacta caacacccat 660 

aacccaacga agagagaggg gatctggcgc ttgatccaag agctccgggc ggaagggatt 720 

cccgtcgacg gagtaggcca ccagacgcat atcaatatcg aatggccgcc cgtaagccag 780 

atcgtggaat cgatccgctt cttcggcgaa aaaggcctcg ataaccaggt gaccgagctg 840 

gatgtgagca tctatacgaa tgacaaggat tcacatggta gttatcaggc catcccgcag 900 

gaagtcttca tcaagcaggg taatcgctac aaggaactct ttgaagggct aaaaagtgta 960 

aaaaactacc tcagcaacgt caccttctgg ggcatggcgg acgatcatac ctggctgaac 1020 

cgttggccca tcgaacggcc cgatgctcct cttcctttcg atatctatct caaggccaag 1080 

ccggcgtatt gggggatcgt ggatgctttg aagctttcgc ggtga 1125 

<210> 340 
<211> 374 
<212> PRT 
<213> Unknown

<220> 

<223> Obtained from an environmental sample. 
<221> SIGNAL
<222> (1)...(23)

<400> 340

Met Pro Met Glu Arg Pro Thr Phe Leu Arg Phe Leu Ala Phe Phe Leu
1 5 10 15

Leu Phe Thr Met Ile Phe Ala Ala Gly Gly Cys Arg Pro Leu Ala Pro
20 25 30

Ser Arg Met Glu Ile Glu Thr Asp Ile Pro Ser Leu Lys Glu Val Ala
35 40 45

Ala Ser Tyr Phe Glu Ile Gly Ala Ala Val Glu Pro Tyr Gln Leu Ser
50 55 60

Ser Pro Pro His Asp Ala Leu Leu Arg Lys His Phe Asn Cys Leu Val
65 70 75 80

Ala Glu Asn Val Met Lys Pro Ala Ser Ile Gln Pro Ser Glu Gly Tyr
85 90 95

Phe Asn Trp Thr Glu Ala Asp Lys Ile Val Asn Tyr Ala Lys Ala His
100 105 110

Gly Met Lys Leu Arg Phe His Thr Leu Val Trp His Asn Gln Val Pro
115 120 125

Asp Trp Phe Phe Ala Gly Asn Asp Lys Thr Arg Leu Leu Gln Arg Leu
130 135 140

Glu Asn His Ile Arg Thr Ile Ile Lys Arg Tyr Gly Asp Lys Val Asp
145 150 155 160

Tyr Trp Asp Val Val Asn Glu Val Ile Asp Asp Asn Gly Gly Met Arg
165 170 175

Asn Ser Lys Trp Tyr Gln Ile Thr Gly Lys Asp Tyr Ile Lys Thr Ala
180 185 190

Phe Arg Val Ala Asp Asp Glu Leu Arg Lys Asn Gly Trp Arg Lys Glu
195 200 205

Gly Arg Gln Leu Tyr Ile Asn Asp Tyr Asn Thr His Asn Pro Thr Lys
210 215 220

Arg Glu Gly Ile Trp Arg Leu Ile Gln Glu Leu Arg Ala Glu Gly Ile
225 230 235 240

Pro Val Asp Gly Val Gly His Gln Thr His Ile Asn Ile Glu Trp Pro
245 250 255

Pro Val Ser Gln Ile Val Glu Ser Ile Arg Phe Phe Gly Glu Lys Gly
260 265 270

Leu Asp Asn Gln Val Thr Glu Leu Asp Val Ser Ile Tyr Thr Asn Asp
275 280 285

Lys Asp Ser His Gly Ser Tyr Gln Ala Ile Pro Gln Glu Val Phe Ile
290 295 300

Lys Gln Gly Asn Arg Tyr Lys Glu Leu Phe Glu Gly Leu Lys Ser Val
305 310 315 320

Lys Asn Tyr Leu Ser Asn Val Thr Phe Trp Gly Met Ala Asp Asp His
325 330 335

Thr Trp Leu Asn Arg Trp Pro Ile Glu Arg Pro Asp Ala Pro Leu Pro
340 345 350

Phe Asp Ile Tyr Leu Lys Ala Lys Pro Ala Tyr Trp Gly Ile Val Asp
355 360 365

Ala Leu Lys Leu Ser Arg
370

<210> 341 
<211> 1347 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 341 

atgacaatta acaacaaaac tacagcgagt cctagtattc ccagcaccca caattccctc 60 
ccgtcgcttc gcacactgtt taccaccagc ctgctcacgc tggccctgac cgcctgcggt 120 
ggttcttcca gcagcgacaa ggacccttca agctccagct ccagtgaatc atcaagttcc 180 
agcgaatcct cgagctcagc ttccagcgaa tcctcgagca gtgagtccag cagtagctct 240 
tccgcgggcc atttctccat cgagccggac ttccagctct acagcctggc caacttcccg 300 
gtgggcgtgg cggtctccgc cgccaacgag aacgacagca tcttcaacag tccggatgcc 360 
gccgaacgtc aggccgttat tattgagcac ttctctcagc tcaccgccgg caacatcatg 420

aaaatgagct acctgcagcc gagtcaaggc aacttcacct tcgatgacgc cgacgagttg 480 

gttaacttcg cccaagccaa tggcatgacc gtacacggcc actccaccat ctggcacgcg 540 

gactaccaag taccgaactt catgagaaac tttgaaggtg accaggagga atgggcagaa 600 

attctgaccg atcacgtcac taccatcatc gagcacttcc ccgacgatgt ggtcatcagc 660 

tgggacgtgg tgaacgaggc tgtcgatcaa ggcacggcga acggctggcg ccattcggtg 720 

ttctacaatg cattcgacgc cccggaagaa ggcgacattc ccgaatacat caaagtcgct 780 

ttccgcgccg cgcgcgaggc tgacgccaac gtagacctct actacaacga ctacgacaat 840 

accgccaatg cccagcgcct ggccaaaaca ctgcaaattg ccgaggtact ggacgccgaa 900 

ggcaccattg acggcgtcgg tttccagatg cacgcctaca tggattaccc gagcctgacc 960 

cattttgaaa acgccttccg gcaagtcgtc gacctggggc tcaaagtgaa agttaccgag 1020 

ctggacgtat ccgtagtcaa cccctacggc ggcgaagcac ctccacaacc ggaatacgac 1080 

aaagaactgg ccggcgcgca aaaactgcgc ttctgccaaa tcgccgaagt ttacatgaac 1140 

actgtacccg aggagttacg cggtggcttc accgtctggg gcctgaccga tgatgaaagt 1200 

tggctgatgc aacagttcag aaacgccacc ggcgccgact acgacgacgt ctggccgtta 1260 

ctgttcaatg ccgacaaatc cgccaaaccg gcactgcaag gcgtggccga cgcctttacc 1320 

ggacaaacct gcacctccga gttctaa 1347

<210> 342 

<211> 448 

<212> PRT 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(45)

<400> 342

Met Thr Ile Asn Asn Lys Thr Thr Ala Ser Pro Ser Ile Pro Ser Thr
1 5 10 15

His Asn Ser Leu Pro Ser Leu Arg Thr Leu Phe Thr Thr Ser Leu Leu
20 25 30

Thr Leu Ala Leu Thr Ala Cys Gly Gly Ser Ser Ser Ser Asp Lys Asp
35 40 45

Pro Ser Ser Ser Ser Ser Ser Glu Ser Ser Ser Ser Ser Glu Ser Ser
50 55 60

Ser Ser Ala Ser Ser Glu Ser Ser Ser Ser Glu Ser Ser Ser Ser Ser
65 70 75 80

Ser Ala Gly His Phe Ser Ile Glu Pro Asp Phe Gln Leu Tyr Ser Leu
85 90 95

Ala Asn Phe Pro Val Gly Val Ala Val Ser Ala Ala Asn Glu Asn Asp
100 105 110

Ser Ile Phe Asn Ser Pro Asp Ala Ala Glu Arg Gln Ala Val Ile Ile
115 120 125

Glu His Phe Ser Gln Leu Thr Ala Gly Asn Ile Met Lys Met Ser Tyr
130 135 140

Leu Gln Pro Ser Gln Gly Asn Phe Thr Phe Asp Asp Ala Asp Glu Leu
145 150 155 160

Val Asn Phe Ala Gln Ala Asn Gly Met Thr Val His Gly His Ser Thr
165 170 175

Ile Trp His Ala Asp Tyr Gln Val Pro Asn Phe Met Arg Asn Phe Glu
180 185 190

Gly Asp Gln Glu Glu Trp Ala Glu Ile Leu Thr Asp His Val Thr Thr
195 200 205

Ile Ile Glu His Phe Pro Asp Asp Val Val Ile Ser Trp Asp Val Val
210 215 220

Asn Glu Ala Val Asp Gln Gly Thr Ala Asn Gly Trp Arg His Ser Val
225 230 235 240

Phe Tyr Asn Ala Phe Asp Ala Pro Glu Glu Gly Asp Ile Pro Glu Tyr
245 250 255

Ile Lys Val Ala Phe Arg Ala Ala Arg Glu Ala Asp Ala Asn Val Asp
260 265 270

Leu Tyr Tyr Asn Asp Tyr Asp Asn Thr Ala Asn Ala Gln Arg Leu Ala
275 280 285

Lys Thr Leu Gln Ile Ala Glu Val Leu Asp Ala Glu Gly Thr Ile Asp
290 295 300

Gly Val Gly Phe Gln Met His Ala Tyr Met Asp Tyr Pro Ser Leu Thr
305 310 315 320

His Phe Glu Asn Ala Phe Arg Gln Val Val Asp Leu Gly Leu Lys Val
325 330 335

Lys Val Thr Glu Leu Asp Val Ser Val Val Asn Pro Tyr Gly Gly Glu
340 345 350

Ala Pro Pro Gln Pro Glu Tyr Asp Lys Glu Leu Ala Gly Ala Gln Lys
355 360 365

Leu Arg Phe Cys Gln Ile Ala Glu Val Tyr Met Asn Thr Val Pro Glu
370 375 380

Glu Leu Arg Gly Gly Phe Thr Val Trp Gly Leu Thr Asp Asp Glu Ser
385 390 395 400

Trp Leu Met Gln Gln Phe Arg Asn Ala Thr Gly Ala Asp Tyr Asp Asp
405 410 415

Val Trp Pro Leu Leu Phe Asn Ala Asp Lys Ser Ala Lys Pro Ala Leu
420 425 430

Gln Gly Val Ala Asp Ala Phe Thr Gly Gln Thr Cys Thr Ser Glu Phe
435 440 445

<210> 343
<211> 2217
<212> DNA
<213> Unknown

<220>

<223> Obtained from an environmental sample.
<400> 343

atggtggagc acgaagctga gcttcatgat taccgcgaac gaatatgcga agttgcttgg 60

caattcgctg ggcctggagg gcgagaaggc aaagcggatt ttgcgggaaa tcaacaactg 120

gctgttttga agatgaaatt atcgagaggg agggggcatg gcgttttgaa acgagcaggt 180

ctgattgggc ttattgcagc gatattgggc tgcgcggcgc tgcttatgca caatgagatc 240

tcatccctgg gaattcagct gagatcatgg ctcagaggga gcgacaatgt gaacgctagc 300

tgggaaaagg attggaagac ggctgctaac gagcaaatcg agcagctccg caagcgcaat 360

gtggagatcg aggtcgtcga tctgaacgga aacccgctgc ctggggctac cgttcgcgcg 420

gttcagcgca cgcatcagtt tggcttcggc accgccatca accgaacggc gttgagcaat 480

ccggtgtacg ccgattttgt caaaaaccgt ttcgaatggg tgaccttcga gaacgaggcc 540

aagtggctct ggaatgaggc cgtacaaggg cgggtctatt atcgggaggc cgatcagctg 600

ctcgaatttg ccaggcaaaa cgggctgaag gtgcgcggac ataatctgtt ctgggaggcg 660

gagaaatatc agccgcagtg ggtgaagagt ctgacgggcg ctgcgctgaa ggaagcgatc 720

gataaccggc tgaacagcgc cgtcctgcat tttaagggca attttctgca ctgggacgtc 780

aacaacgaaa tgtttcacgg cagcttcttc aaggatcgcc tgggggaaga aatctggacc 840

tatatgtata agcgaacccg ggaactcgat cccggcgtca agctgttcgt caacgattac 900

aattttatcg agtacccgcc ggagcgggat tataaccagg tcattcaagc gctcatcgat 960

cgggggatgc cgattgacgg catcggcgcg caagggcatt ttaacggagt catcgatccc 1020

ttgttcgtta agggaagact ggataagctg gctgagctga atctgccgat ctggattacc 1080

gaattcgatt ccacgcataa ggacgagaga gtccgtgccg ataatctgga gaagatgtat 1140

cggctggcgt tcgcccatcc ggcggtcgaa gggattgtca tgtggggctt ctgggcgggc 1200

tcccattgga agggcactga cggcgcgatc gtgaatcaag actggacgct caatgccgcc 1260

ggacagcgat accagcagct tatggatgaa tggacgacgg tcgtcgaagg cacgaccgat 1320

cagcgcggca tgttttcgtt ccgggggttc cacggaacct acgatatgct ggtcgattac 1380

cctggagcgg cggctgtgaa gcagtccttt accttggagc ccggctctgg caatgcgaag 1440

ctgcacattc cgttcgacgt tcaggacaag tccatcccgg aggctcctgc caagctcagc 1500

gccgctgccg cggattccca ggttatgctg agctggagca aggtcaacgg ggcaaccggc 1560

tatacggtta aaagcgcggt cagcgccgac ggtccctata cgccgattgc ccatcagctg 1620

ctcaccgaga ccttcacgca catcggtcta gtgaaccgga aagattatta ttacgtggtg 1680

agcgccagca accatctagg tgagagcccg gattccgccc cgatccgggc cactccgcgt 1740

gccgcgggcg agttacaaac gaatctcgtg cttcagtacc gctccgctga tggagataac 1800

aactatcaaa tgaagcctca gttcacgatc aagaacgcag gcaaagtgcc catcccgtta 1860

agcgagctga cgatccgcta ctatttcacg ccggagagca cgcagccggt ggataccagg 1920

atcgactggg cccaattcgg agcagagcat gtccagacga cggtcgttcc gccatccgat 1980

gccgcggcgc acgcctatgt cgagctcagc ttcctggagt cggcaggggc catcccttcc 2040

gatacgacat taggcaatat tcagctgcgc atctttaaca gcgatggctc ttcgttcgat 2100

aaaacgaacg attattcctt cgacccgacg aaaaaggctt atacggcgtg ggagaaggtc 2160

acgctttatc ggaacgggga actggtttgg gggatagagc cttggggcgc gaagtaa 2217

<210> 344
<211> 738
<212> PRT
<213> Unknown

<220>

<223> Obtained from an environmental sample.
<400> 344

Met Val Glu His Glu Ala Glu Leu His Asp Tyr Arg Glu Arg Ile Cys
1 5 10 15

Glu Val Ala Trp Gln Phe Ala Gly Pro Gly Gly Arg Glu Gly Lys Ala
20 25 30

Asp Phe Ala Gly Asn Gln Gln Leu Ala Val Leu Lys Met Lys Leu Ser
35 40 45

Arg Gly Arg Gly His Gly Val Leu Lys Arg Ala Gly Leu Ile Gly Leu
50 55 60

Ile Ala Ala Ile Leu Gly Cys Ala Ala Leu Leu Met His Asn Glu Ile
65 70 75 80

Ser Ser Leu Gly Ile Gln Leu Arg Ser Trp Leu Arg Gly Ser Asp Asn
85 90 95

Val Asn Ala Ser Trp Glu Lys Asp Trp Lys Thr Ala Ala Asn Glu Gln
100 105 110

Ile Glu Gln Leu Arg Lys Arg Asn Val Glu Ile Glu Val Val Asp Leu
115 120 125

Asn Gly Asn Pro Leu Pro Gly Ala Thr Val Arg Ala Val Gln Arg Thr
130 135 140

His Gln Phe Gly Phe Gly Thr Ala Ile Asn Arg Thr Ala Leu Ser Asn
145 150 155 160

Pro Val Tyr Ala Asp Phe Val Lys Asn Arg Phe Glu Trp Val Thr Phe
165 170 175

Glu Asn Glu Ala Lys Trp Leu Trp Asn Glu Ala Val Gln Gly Arg Val
180 185 190

Tyr Tyr Arg Glu Ala Asp Gln Leu Leu Glu Phe Ala Arg Gln Asn Gly
195 200 205

Leu Lys Val Arg Gly His Asn Leu Phe Trp Glu Ala Glu Lys Tyr Gln
210 215 220

Pro Gln Trp Val Lys Ser Leu Thr Gly Ala Ala Leu Lys Glu Ala Ile
225 230 235 240

Asp Asn Arg Leu Asn Ser Ala Val Leu His Phe Lys Gly Asn Phe Leu
245 250 255

His Trp Asp Val Asn Asn Glu Met Phe His Gly Ser Phe Phe Lys Asp
260 265 270

Arg Leu Gly Glu Glu Ile Trp Thr Tyr Met Tyr Lys Arg Thr Arg Glu
275 280 285

Leu Asp Pro Gly Val Lys Leu Phe Val Asn Asp Tyr Asn Phe Ile Glu
290 295 300

Tyr Pro Pro Glu Arg Asp Tyr Asn Gln Val Ile Gln Ala Leu Ile Asp
305 310 315 320

Arg Gly Met Pro Ile Asp Gly Ile Gly Ala Gln Gly His Phe Asn Gly
325 330 335

Val Ile Asp Pro Leu Phe Val Lys Gly Arg Leu Asp Lys Leu Ala Glu
340 345 350

Leu Asn Leu Pro Ile Trp Ile Thr Glu Phe Asp Ser Thr His Lys Asp
355 360 365

Glu Arg Val Arg Ala Asp Asn Leu Glu Lys Met Tyr Arg Leu Ala Phe
370 375 380

Ala His Pro Ala Val Glu Gly Ile Val Met Trp Gly Phe Trp Ala Gly
385 390 395 400

Ser His Trp Lys Gly Thr Asp Gly Ala Ile Val Asn Gln Asp Trp Thr
405 410 415

Leu Asn Ala Ala Gly Gln Arg Tyr Gln Gln Leu Met Asp Glu Trp Thr
420 425 430

Thr Val Val Glu Gly Thr Thr Asp Gln Arg Gly Met Phe Ser Phe Arg
435 440 445

Gly Phe His Gly Thr Tyr Asp Met Leu Val Asp Tyr Pro Gly Ala Ala
450 455 460

Ala Val Lys Gln Ser Phe Thr Leu Glu Pro Gly Ser Gly Asn Ala Lys
465 470 475 480

Leu His Ile Pro Phe Asp Val Gln Asp Lys Ser Ile Pro Glu Ala Pro
485 490 495

Ala Lys Leu Ser Ala Ala Ala Ala Asp Ser Gln Val Met Leu Ser Trp
500 505 510

Ser Lys Val Asn Gly Ala Thr Gly Tyr Thr Val Lys Ser Ala Val Ser
515 520 525

Ala Asp Gly Pro Tyr Thr Pro Ile Ala His Gln Leu Leu Thr Glu Thr
530 535 540

Phe Thr His Ile Gly Leu Val Asn Arg Lys Asp Tyr Tyr Tyr Val Val
545 550 555 560

Ser Ala Ser Asn His Leu Gly Glu Ser Pro Asp Ser Ala Pro Ile Arg
565 570 575

Ala Thr Pro Arg Ala Ala Gly Glu Leu Gln Thr Asn Leu Val Leu Gln
580 585 590

Tyr Arg Ser Ala Asp Gly Asp Asn Asn Tyr Gln Met Lys Pro Gln Phe
595 600 605

Thr Ile Lys Asn Ala Gly Lys Val Pro Ile Pro Leu Ser Glu Leu Thr
610 615 620

Ile Arg Tyr Tyr Phe Thr Pro Glu Ser Thr Gln Pro Val Asp Thr Arg
625 630 635 640

Ile Asp Trp Ala Gln Phe Gly Ala Glu His Val Gln Thr Thr Val Val
645 650 655

Pro Pro Ser Asp Ala Ala Ala His Ala Tyr Val Glu Leu Ser Phe Leu
660 665 670

Glu Ser Ala Gly Ala Ile Pro Ser Asp Thr Thr Leu Gly Asn Ile Gln
675 680 685

Leu Arg Ile Phe Asn Ser Asp Gly Ser Ser Phe Asp Lys Thr Asn Asp
690 695 700

Tyr Ser Phe Asp Pro Thr Lys Lys Ala Tyr Thr Ala Trp Glu Lys Val
705 710 715 720

Thr Leu Tyr Arg Asn Gly Glu Leu Val Trp Gly Ile Glu Pro Trp Gly
725 730 735

Ala Lys

<210> 345 

<211> 849 

<212> DNA 

<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 345 

atgaagatga cctacatgca tccggctgaa gatacttact cgtttggtca agcggatcag 60 

ttggtcaact gggcgaaagc gaatggtatt ggcgtgcacg gccacactct ggtttggcac 120 

tccgaatacc aggtacccaa ttggatgaaa aattactctg gtgatgcaac tgcattccaa 180 

accatgctca acacccatgt gaaaactgtg gctgagcatt ttgctggcga actggacagc 240 

tgggacgttg tgaatgaagt gctggagccg ggctccaatg gttgctggcg tgaaaactct 300 

ctgttctacc agaagcttgg caaagacttt gtcgcgaacg cattccgtgc agctcgcgag 360 

ggcgatccca atgcagactt gtattacaac gattactcga ctgaaaatgg tgtaacttcc 420 

gatgagaagt tcagttgttt gttggaacta gtcgatgagc ttctggaagc ggacgtgccg 480 

attacaggtg ttggtttcca aatgcacgtg caggcgacgt ggcctagcaa tgccaacatc 540 

ggcaaggcat tcaaagccat cgcggatcgc ggtctgaaag ttaaaatttc tgagctcgat 600 

gttcctgtta acaaccctta cggaaccact aatttcccgc aatacagcag ttttaccgcg 660 

gaagccgccg agctgcagaa gcagcgctac aagggcatta tgcaagcgta ccttgataac 720 

gtaccggcca acctgcgtgg tggtttcacc gtgtggggcg tttgggatgg cgatagctgg 780 

atcatgacgt tcagccagta caccaacgct aacgccaacg actggccact gttgttcacc 840 

gggccgtag 849 

<210> 346 
<211> 282 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 346 

Met Lys Met Thr Tyr Met His Pro Ala Glu Asp Thr Tyr Ser Phe Gly
1 5 10 15

Gln Ala Asp Gln Leu Val Asn Trp Ala Lys Ala Asn Gly Ile Gly Val
20 25 30

His Gly His Thr Leu Val Trp His Ser Glu Tyr Gln Val Pro Asn Trp
35 40 45

Met Lys Asn Tyr Ser Gly Asp Ala Thr Ala Phe Gln Thr Met Leu Asn
50 55 60

Thr His Val Lys Thr Val Ala Glu His Phe Ala Gly Glu Leu Asp Ser
65 70 75 80

Trp Asp Val Val Asn Glu Val Leu Glu Pro Gly Ser Asn Gly Cys Trp
85 90 95

Arg Glu Asn Ser Leu Phe Tyr Gln Lys Leu Gly Lys Asp Phe Val Ala
100 105 110

Asn Ala Phe Arg Ala Ala Arg Glu Gly Asp Pro Asn Ala Asp Leu Tyr
115 120 125

Tyr Asn Asp Tyr Ser Thr Glu Asn Gly Val Thr Ser Asp Glu Lys Phe
130 135 140

Ser Cys Leu Leu Glu Leu Val Asp Glu Leu Leu Glu Ala Asp Val Pro
145 150 155 160

Ile Thr Gly Val Gly Phe Gln Met His Val Gln Ala Thr Trp Pro Ser
165 170 175

Asn Ala Asn Ile Gly Lys Ala Phe Lys Ala Ile Ala Asp Arg Gly Leu
180 185 190

Lys Val Lys Ile Ser Glu Leu Asp Val Pro Val Asn Asn Pro Tyr Gly
195 200 205

Thr Thr Asn Phe Pro Gln Tyr Ser Ser Phe Thr Ala Glu Ala Ala Glu
210 215 220

Leu Gln Lys Gln Arg Tyr Lys Gly Ile Met Gln Ala Tyr Leu Asp Asn
225 230 235 240

Val Pro Ala Asn Leu Arg Gly Gly Phe Thr Val Trp Gly Val Trp Asp
245 250 255

Gly Asp Ser Trp Ile Met Thr Phe Ser Gln Tyr Thr Asn Ala Asn Ala
260 265 270

Asn Asp Trp Pro Leu Leu Phe Thr Gly Pro
275 280

<210> 347 
<211> 1794 
<212> DNA 
<213> Unknown

<220> 

<223> Obtained from an environmental sample. 
<400> 347 

atgcccgttt tgttcgccct gtttcttgtt gcctcgtcct gcgcggcgca gtcgctggcc 60

gggccggttt ccctgcttgg cggagatgcg ggcgcggcgt tccgctatac cgggccatcg 120 

gcgggcgcgg cgagcggctc ggccgaatgg gtggcggtgg agaacatgcc gttcacgcac 180 

gcctggcggc tgcgcacgaa tccgctgccg gagagcggcg gcaacgaatg ggacctgcgc 240 

atccgcgccc gcggagcggc ggctgtttcg gcaggggaca agatcctggc cgagttctgg 300 

atgcgctgcg tggagcccga aaacggcgac tgcattctgc gcctgaacgt ggagcgcgac 360 

gggtcgccgt ggaccaaatc catcagcaac ccctacccgg tgggccggga gtggcggcgg 420 

ttccgcgtgc tgttcgagat gcgggagagc tacgccgccg gcggctacat gatcgatttc 480 

tggatgggcc agcaggtgca gacggcggaa gtgggcggga tttccctgct gaattacggt 540 

ccgcaggcca cggccgagca gcttggcctg gaccggtttt atgagggcgc ggcggcggac 600 

gccgcgtggc ggcaggcggc cgagcagcgg atcgaggaga tccggaaagc gggcatgatc 660 

atcgtggcgg tgacgccgga cggcgagccg atcgagggcg ctgaaatccg ggcgaagctg 720 

aagcggcacg cgttcgggtg gggcacggct gtggcggcat cacggcttct ggggacggga 780 

acggacagcg agcgctaccg caacttcatc cgcgagaact tcaacatggc ggtgctcgag 840 

aacgacctga aatggggccc gttcgaagag aaccgcaacc gcgcgatgaa cgcgctgcgc 900 

tggctgcatg agaacgggat cacgtggatc cgcgggcaca atctcgtctg gccgggctgg 960 

cggtggatgc cgaacgacgt gcgcaacctg gcgaacaatc ccgaggcgct gcggcagcgg 1020 

attctggacc gcatccggga cacggccacg gccacgcgcg ggctggtggt gcactgggac 1080 

gtcgtcaacg agccggtggc cgagcgcgac gtgctgaaca ttctgggcga cgaggtgatg 1140 

gcggactggt tccgcgccgc gaaggagtgc gatcccgagg cgaggatgtt catcaatgag 1200 

tacgacattc tggcggcgaa cggggccaat ctgcggaagc agaacgcgta ttaccgcatg 1260 

atcgagatgc tgttgaagct cgaggcgccg gtggagggca tcggcttcca gggccacttc 1320 

gacacggcca cgccgccgga gcggatgctg gagatcatga accggtacgc ccggctcggg 1380 

ctgccgatcg ccatcaccga gtacgatttc gccacggcgg acgaggagct gcaggcgcag 1440 

ttcacgcgcg acctgatgat tctcgccttc agccatccgg cggtttcgga cttcctgatg 1500 

tggggcttct gggaagggag ccactggaag ccgctgggcg ccatgatccg gcgcgactgg 1560 

agcgagaagc cgatgtaccg cgtctggcgc gagctgatct tcgagcgctg gcagacggat 1620 
gaaacaggcg tgacgccgga gcacggtgcc atctacgtgc ggggcttcaa gggcgactac 1680
gagatcacgg tgaaggcggg cgggcaggaa gtccgggtgc cgtacacgct gaaagaagac 1740 
ggccaggtgc tgtgggtgac ggtgggcggg gcttctgaag agcgcgtgca gtaa 1794 

<210> 348 
<211> 597 
<212> PRT 
<213> Unknown

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(20)

<400> 348

Met Pro Val Leu Phe Ala Leu Phe Leu Val Ala Ser Ser Cys Ala Ala
1 5 10 15

Gln Ser Leu Ala Gly Pro Val Ser Leu Leu Gly Gly Asp Ala Gly Ala
20 25 30

Ala Phe Arg Tyr Thr Gly Pro Ser Ala Gly Ala Ala Ser Gly Ser Ala
35 40 45

Glu Trp Val Ala Val Glu Asn Met Pro Phe Thr His Ala Trp Arg Leu
50 55 60

Arg Thr Asn Pro Leu Pro Glu Ser Gly Gly Asn Glu Trp Asp Leu Arg
65 70 75 80

Ile Arg Ala Arg Gly Ala Ala Ala Val Ser Ala Gly Asp Lys Ile Leu
85 90 95

Ala Glu Phe Trp Met Arg Cys Val Glu Pro Glu Asn Gly Asp Cys Ile
100 105 110

Leu Arg Leu Asn Val Glu Arg Asp Gly Ser Pro Trp Thr Lys Ser Ile
115 120 125

Ser Asn Pro Tyr Pro Val Gly Arg Glu Trp Arg Arg Phe Arg Val Leu
130 135 140

Phe Glu Met Arg Glu Ser Tyr Ala Ala Gly Gly Tyr Met Ile Asp Phe
145 150 155 160

Trp Met Gly Gln Gln Val Gln Thr Ala Glu Val Gly Gly Ile Ser Leu
165 170 175

Leu Asn Tyr Gly Pro Gln Ala Thr Ala Glu Gln Leu Gly Leu Asp Arg
180 185 190

Phe Tyr Glu Gly Ala Ala Ala Asp Ala Ala Trp Arg Gln Ala Ala Glu
195 200 205

Gln Arg Ile Glu Glu Ile Arg Lys Ala Gly Met Ile Ile Val Ala Val
210 215 220

Thr Pro Asp Gly Glu Pro Ile Glu Gly Ala Glu Ile Arg Ala Lys Leu
225 230 235 240

Lys Arg His Ala Phe Gly Trp Gly Thr Ala Val Ala Ala Ser Arg Leu
245 250 255

Leu Gly Thr Gly Thr Asp Ser Glu Arg Tyr Arg Asn Phe Ile Arg Glu
260 265 270

Asn Phe Asn Met Ala Val Leu Glu Asn Asp Leu Lys Trp Gly Pro Phe
275 280 285

Glu Glu Asn Arg Asn Arg Ala Met Asn Ala Leu Arg Trp Leu His Glu
290 295 300

Asn Gly Ile Thr Trp Ile Arg Gly His Asn Leu Val Trp Pro Gly Trp
305 310 315 320

Arg Trp Met Pro Asn Asp Val Arg Asn Leu Ala Asn Asn Pro Glu Ala
325 330 335

Leu Arg Gln Arg Ile Leu Asp Arg Ile Arg Asp Thr Ala Thr Ala Thr
340 345 350

Arg Gly Leu Val Val His Trp Asp Val Val Asn Glu Pro Val Ala Glu
355 360 365

Arg Asp Val Leu Asn Ile Leu Gly Asp Glu Val Met Ala Asp Trp Phe
370 375 380

Arg Ala Ala Lys Glu Cys Asp Pro Glu Ala Arg Met Phe Ile Asn Glu
385 390 395 400

Tyr Asp Ile Leu Ala Ala Asn Gly Ala Asn Leu Arg Lys Gln Asn Ala
405 410 415

Tyr Tyr Arg Met Ile Glu Met Leu Leu Lys Leu Glu Ala Pro Val Glu
420 425 430

Gly Ile Gly Phe Gln Gly His Phe Asp Thr Ala Thr Pro Pro Glu Arg
435 440 445

Met Leu Glu Ile Met Asn Arg Tyr Ala Arg Leu Gly Leu Pro Ile Ala
450 455 460

Ile Thr Glu Tyr Asp Phe Ala Thr Ala Asp Glu Glu Leu Gln Ala Gln
465 470 475 480

Phe Thr Arg Asp Leu Met Ile Leu Ala Phe Ser His Pro Ala Val Ser
485 490 495

Asp Phe Leu Met Trp Gly Phe Trp Glu Gly Ser His Trp Lys Pro Leu
500 505 510

Gly Ala Met Ile Arg Arg Asp Trp Ser Glu Lys Pro Met Tyr Arg Val
515 520 525

Trp Arg Glu Leu Ile Phe Glu Arg Trp Gln Thr Asp Glu Thr Gly Val
530 535 540

Thr Pro Glu His Gly Ala Ile Tyr Val Arg Gly Phe Lys Gly Asp Tyr
545 550 555 560

Glu Ile Thr Val Lys Ala Gly Gly Gln Glu Val Arg Val Pro Tyr Thr
565 570 575

Leu Lys Glu Asp Gly Gln Val Leu Trp Val Thr Val Gly Gly Ala Ser
580 585 590

Glu Glu Arg Val Gln
595



<210> 349 
<211> 1794 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample.



<400> 349 

atgcccgttt tgttcgccct gtttcttgtt gcctcgtcct gcgcggcgca gtcgctggcc 60

gggccggttt ccctgcttgg cggagatgcg ggcgcggcgt tccgctatac cgggccatcg 120

gcgggcgcgg cgagcggctc ggccgaatgg gtggcggtgg agaacatgcc gttcacgcac 180

gcctggcggc tgcgcacgaa tccgctgccg gagagcggcg gcaacgaatg ggacctgcgc 240

atccgcgccc gcggagcggc ggctgtttcg gcaggggaca agatcctggc cgagttctgg 300

atgcgctgcg tggagcccga aaacggcgac tgcattctgc gcctgaacgt ggagcgcgac 360

gggtcgccgt ggaccaaatc catcagcaac ccctacccgg tgggccggga gtggcggcgg 420

ttccgcgtgc tgttcgagat gcgggagagc tacgccgccg gcggctacat gatcgatttc 480

tggatgggcc agcaggtgca gacggcggaa gtgggcggga tttccctgct gaattacggt 540

ccgcaggcca cggccgagca gcttggcctg gaccggttct atgaaggcgc ggcggcggac 600

gccgcgtggc ggcaggcggc cgagcagcgg atcgaggaga tccggaaagc gggcatgatc 660

atcgtggcgg tgacgccgga cggcgagccg atcgaaggcg ccgagatccg ggcgaagctg 720

aagcggcacg cgttcgggtg gggcacggcg gtggcggcat cacggcttct ggggacggga 780

acggacagcg agcgctaccg caacttcatc cgcgagaact tcaacatggc ggtgctcgag 840

aacgacctga aatggggccc gttcgaggag aaccgcgccc gcgcaatgaa cgcgctgcgc 900

tggctgcatg agaacgggat cacgtggatc cgcgggcaca atctcgtctg gccaggctgg 960

cggtggatgc cgagcgacgt gcgcaacctg gcgaacaatc ccgaagccct gcggcagcgg 1020

attctggacc gcatccggga cacggccacc gccacgcgcg ggctggtcgt gcactgggac 1080

gtcgtcaacg agccggtggc cgagcgcgac gtgctgaaca ttctgggcga cgaggtgatg 1140

gcggactggt tccgcgccgc gaaggagtgc gatcccgagg cgaggatgtt catcaacgaa 1200

tacgacattc tggcggcgaa cggggccaac ctgcggaagc agaacgcgta ctaccgcatg 1260

atcgagatgc tgttgaagct cgaggcgccg gtagagggca tcggcttcca gggccatttc 1320

gacacggcta cgccgccgga gcggatgctg gagatcatga accggtacgc ccggctcggg 1380

ctgccgatcg ccatcaccga gtacgatttc gccacggtag acgaagagct gcaggcgcag 1440

ttcacgcgcg acctgatgat tctcgctttc agccatccgg cggtttcgga cttcttgatg 1500

tggggcttct gggaagggag ccactggaag ccgctgggcg ccatgatccg gcgcgactgg 1560

agcgagaagc cgatgtaccg cgtctggcgc gagctgatct tcgagcgctg gcagacggat 1620

gaaacgggag tgacgccgga gcacggggcc atctacgtgc ggggcttcaa gggcgactac 1680

gaaatcacgg tgaaggctgg cgggcaggaa gtccgggtgc cgtacacgct gaaagaagac 1740

ggccaggtgc tgtgggtgac ggtgggcggt acttctgaag agcaggcgcc gtaa 1794



<210> 350 
<211> 597 
<212> PRT 
<213> Unknown 
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<220> 

<223> obtained from an environmental sample. 

<221> SIGNAL 
<222> (D...C20) 

<400> 350 

Met Pro val Leu Phe Ala Leu Phe Leu Val Ala Ser Ser Cys Ala Ala 

15 10 15 

Gin Ser Leu Ala Gly Pro Val Ser Leu Leu Gly Gly Asp Ala Gly Ala 

20 25 30 

Ala Phe Arg Tyr Thr Gly Pro Ser Ala Gly Ala Ala Ser Gly Ser Ala
35 40 45

Glu Trp Val Ala Val Glu Asn Met Pro Phe Thr His Ala Trp Arg Leu
50 55 60

Arg Thr Asn Pro Leu Pro Glu Ser Gly Gly Asn Glu Trp Asp Leu Arg
65 70 75 80

Ile Arg Ala Arg Gly Ala Ala Ala Val Ser Ala Gly Asp Lys Ile Leu
85 90 95

Ala Glu Phe Trp Met Arg Cys Val Glu Pro Glu Asn Gly Asp Cys Ile
100 105 110

Leu Arg Leu Asn Val Glu Arg Asp Gly Ser Pro Trp Thr Lys Ser Ile
115 120 125

Ser Asn Pro Tyr Pro Val Gly Arg Glu Trp Arg Arg Phe Arg Val Leu
130 135 140

Phe Glu Met Arg Glu Ser Tyr Ala Ala Gly Gly Tyr Met Ile Asp Phe
145 150 155 160

Trp Met Gly Gln Gln Val Gln Thr Ala Glu Val Gly Gly Ile Ser Leu
165 170 175

Leu Asn Tyr Gly Pro Gln Ala Thr Ala Glu Gln Leu Gly Leu Asp Arg
180 185 190

Phe Tyr Glu Gly Ala Ala Ala Asp Ala Ala Trp Arg Gln Ala Ala Glu
195 200 205

Gln Arg Ile Glu Glu Ile Arg Lys Ala Gly Met Ile Ile Val Ala Val
210 215 220

Thr Pro Asp Gly Glu Pro Ile Glu Gly Ala Glu Ile Arg Ala Lys Leu
225 230 235 240

Lys Arg His Ala Phe Gly Trp Gly Thr Ala Val Ala Ala Ser Arg Leu
245 250 255

Leu Gly Thr Gly Thr Asp Ser Glu Arg Tyr Arg Asn Phe Ile Arg Glu
260 265 270

Asn Phe Asn Met Ala Val Leu Glu Asn Asp Leu Lys Trp Gly Pro Phe
275 280 285

Glu Glu Asn Arg Ala Arg Ala Met Asn Ala Leu Arg Trp Leu His Glu
290 295 300

Asn Gly Ile Thr Trp Ile Arg Gly His Asn Leu Val Trp Pro Gly Trp
305 310 315 320

Arg Trp Met Pro Ser Asp Val Arg Asn Leu Ala Asn Asn Pro Glu Ala
325 330 335

Leu Arg Gln Arg Ile Leu Asp Arg Ile Arg Asp Thr Ala Thr Ala Thr
340 345 350

Arg Gly Leu Val Val His Trp Asp Val Val Asn Glu Pro Val Ala Glu
355 360 365

Arg Asp Val Leu Asn Ile Leu Gly Asp Glu Val Met Ala Asp Trp Phe
370 375 380

Arg Ala Ala Lys Glu Cys Asp Pro Glu Ala Arg Met Phe Ile Asn Glu
385 390 395 400

Tyr Asp Ile Leu Ala Ala Asn Gly Ala Asn Leu Arg Lys Gln Asn Ala
405 410 415

Tyr Tyr Arg Met Ile Glu Met Leu Leu Lys Leu Glu Ala Pro Val Glu
420 425 430

Gly Ile Gly Phe Gln Gly His Phe Asp Thr Ala Thr Pro Pro Glu Arg
435 440 445

Met Leu Glu Ile Met Asn Arg Tyr Ala Arg Leu Gly Leu Pro Ile Ala
450 455 460

Ile Thr Glu Tyr Asp Phe Ala Thr Val Asp Glu Glu Leu Gln Ala Gln
465 470 475 480

Phe Thr Arg Asp Leu Met Ile Leu Ala Phe Ser His Pro Ala Val Ser
485 490 495

Asp Phe Leu Met Trp Gly Phe Trp Glu Gly Ser His Trp Lys Pro Leu
500 505 510

Gly Ala Met Ile Arg Arg Asp Trp Ser Glu Lys Pro Met Tyr Arg Val
515 520 525

Trp Arg Glu Leu Ile Phe Glu Arg Trp Gln Thr Asp Glu Thr Gly Val
530 535 540

Thr Pro Glu His Gly Ala Ile Tyr Val Arg Gly Phe Lys Gly Asp Tyr
545 550 555 560

Glu Ile Thr Val Lys Ala Gly Gly Gln Glu Val Arg Val Pro Tyr Thr
565 570 575

Leu Lys Glu Asp Gly Gln Val Leu Trp Val Thr Val Gly Gly Thr Ser
580 585 590

Glu Glu Gln Ala Pro
595

<210> 351
<211> 1860
<212> DNA
<213> Unknown

<220>
<223> Obtained from an environmental sample.

<400> 351 

atgcaccttc caaattaccg ctcgctcgca acggcgctca gtcgctacag ttgctcggcg 60
ctgctggccg tcagtcttgt ggcctgcggg ggcaataatg accaagatcc gccgaccccg 120
gagccaactc cggtcccaac gccgactcca actccaaccc cgaccccgac tccggagcca 180
accccgactc cgactccgga gcccactccg actccagagc caactccgac gccgaccccg 240
gaaccaacac cgacgccgga gccgacgcca acaccggatc ccggggccga ctaccagccg 300
cccagcaatg acattgccgt caatggcgac gtggaaagcg gtactaccaa ctggggtgca 360
cgcggttcgg catccattag ccgagtcact ttagagagct ttgaaggtga tgccagcttg 420
agtgttaccg gccgagaaga cgactggcat ggcgccacct tctctgtagg ccatctgacc 480
ccgggtaata gctatgaagt ggctgcgtgg gtcaagttag cctcaggcga gcccaacaca 540
gtggtcaaaa tcacgggtaa gcgcgagggc gagagcgcga cttacgaaga gtacacggat 600
gtcggtacgg cattggctac cgacggtagc tggaccgaaa ttaccggcac ttatattcct 660
gatagcgcca gcccatttga atattttatt gtggagaccc aagagggtgg accgaccgtt 720
agcttctacg tggacgcgtt ttcagtggcc ggtgaggtgg aagatacgcc agcgccaacg 780
ccgcccccaa ccgctccgcc accgagtggc tcaggcctag cggaactagt ggatttcccg 840
gtgggcgttg ccgttgcggt agctagtttt gccaataacg atttcctgag taacacgcaa 900
caacaagata tcgtgcttaa caattttagt gaaattgttg cggagaatca gatgaagatg 960
gaatatttca acgatgacta ctccaacccc agggcagatc aactggtcag ctgggccaat 1020
gagcgaggta ttcgggttca cggccacgct ttggtctggc acgcgcaagc agcgtcatgg 1080
gtcagtcctc cggtaagcaa ctttcgcgag cgttatgtca accatgttcg tggtgtggca 1140
tcccggtatg cggacacggt agtaagctgg gatgtcgtca atgaagcttt gaccgatgat 1200
gatgtctccc cgggtggaag ctactaccgg caatctgagt tctaccgaca gttcaatggc 1260
ccagagttca ttgatatcgc tttccgtgaa gctcgagagg cagcccccaa tgcgctgctt 1320
tactacaacg actacaatat tgagaacgga ctggacaaaa ccgatggttt gattcagcta 1380
cttgagaggt tgagggataa tgacgtgccc attgatggcg tgggcttcca aatgcatgtt 1440
ttgttggatt ggcccgatat cagcactatt cgacgttcct gggagcgcgc attagcggtt 1500
gaccccgatg accgtatgct tttaaaaata acggaactcg atgtgcgtat caacaatccc 1560
tacgacgata atctcgaaag aggcatcgtt cactccagcc ggggtgactg cgatgacatt 1620
tccggggtct gcgaaggctt tgagcggcaa gccgctcgtt accgggagat tattgaggcc 1680
tactttgacg tcgtgccacc gcaccgccga ggtggcatca gtgtttgggg tattgccgac 1740
cattatagtt ggtattatac ccatgaaggc tatgtcgatt ggcccctttt gtgggacagg 1800
aacctacagc caaagcctgc ttacaacgct gtttatgaag ctctgcagca gggccaataa 1860



<210> 352 
<211> 619 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(73)

<400> 352 

Met His Leu Pro Asn Tyr Arg Ser Leu Ala Thr Ala Leu Ser Arg Tyr
1 5 10 15

Ser Cys Ser Ala Leu Leu Ala Val Ser Leu Val Ala Cys Gly Gly Asn
20 25 30

Asn Asp Gln Asp Pro Pro Thr Pro Glu Pro Thr Pro Val Pro Thr Pro
35 40 45

Thr Pro Thr Pro Thr Pro Thr Pro Thr Pro Glu Pro Thr Pro Thr Pro
50 55 60

Thr Pro Glu Pro Thr Pro Thr Pro Glu Pro Thr Pro Thr Pro Thr Pro
65 70 75 80

Glu Pro Thr Pro Thr Pro Glu Pro Thr Pro Thr Pro Asp Pro Gly Ala
85 90 95

Asp Tyr Gln Pro Pro Ser Asn Asp Ile Ala Val Asn Gly Asp Val Glu
100 105 110

Ser Gly Thr Thr Asn Trp Gly Ala Arg Gly Ser Ala Ser Ile Ser Arg
115 120 125

Val Thr Leu Glu Ser Phe Glu Gly Asp Ala Ser Leu Ser Val Thr Gly
130 135 140

Arg Glu Asp Asp Trp His Gly Ala Thr Phe Ser Val Gly His Leu Thr
145 150 155 160

Pro Gly Asn Ser Tyr Glu Val Ala Ala Trp Val Lys Leu Ala Ser Gly
165 170 175

Glu Pro Asn Thr Val Val Lys Ile Thr Gly Lys Arg Glu Gly Glu Ser
180 185 190

Ala Thr Tyr Glu Glu Tyr Thr Asp Val Gly Thr Ala Leu Ala Thr Asp
195 200 205

Gly Ser Trp Thr Glu Ile Thr Gly Thr Tyr Ile Pro Asp Ser Ala Ser
210 215 220

Pro Phe Glu Tyr Phe Ile Val Glu Thr Gln Glu Gly Gly Pro Thr Val
225 230 235 240

Ser Phe Tyr Val Asp Ala Phe Ser Val Ala Gly Glu Val Glu Asp Thr
245 250 255

Pro Ala Pro Thr Pro Pro Pro Thr Ala Pro Pro Pro Ser Gly Ser Gly
260 265 270

Leu Ala Glu Leu Val Asp Phe Pro Val Gly Val Ala Val Ala Val Ala
275 280 285

Ser Phe Ala Asn Asn Asp Phe Leu Ser Asn Thr Gln Gln Gln Asp Ile
290 295 300

Val Leu Asn Asn Phe Ser Glu Ile Val Ala Glu Asn Gln Met Lys Met
305 310 315 320

Glu Tyr Phe Asn Asp Asp Tyr Ser Asn Pro Arg Ala Asp Gln Leu Val
325 330 335

Ser Trp Ala Asn Glu Arg Gly Ile Arg Val His Gly His Ala Leu Val
340 345 350

Trp His Ala Gln Ala Ala Ser Trp Val Ser Pro Pro Val Ser Asn Phe
355 360 365

Arg Glu Arg Tyr Val Asn His Val Arg Gly Val Ala Ser Arg Tyr Ala
370 375 380

Asp Thr Val Val Ser Trp Asp Val Val Asn Glu Ala Leu Thr Asp Asp
385 390 395 400

Asp Val Ser Pro Gly Gly Ser Tyr Tyr Arg Gln Ser Glu Phe Tyr Arg
405 410 415

Gln Phe Asn Gly Pro Glu Phe Ile Asp Ile Ala Phe Arg Glu Ala Arg
420 425 430

Glu Ala Ala Pro Asn Ala Leu Leu Tyr Tyr Asn Asp Tyr Asn Ile Glu
435 440 445

Asn Gly Leu Asp Lys Thr Asp Gly Leu Ile Gln Leu Leu Glu Arg Leu
450 455 460

Arg Asp Asn Asp Val Pro Ile Asp Gly Val Gly Phe Gln Met His Val
465 470 475 480

Leu Leu Asp Trp Pro Asp Ile Ser Thr Ile Arg Arg Ser Trp Glu Arg
485 490 495

Ala Leu Ala Val Asp Pro Asp Asp Arg Met Leu Leu Lys Ile Thr Glu
500 505 510

Leu Asp Val Arg Ile Asn Asn Pro Tyr Asp Asp Asn Leu Glu Arg Gly
515 520 525

Ile Val His Ser Ser Arg Gly Asp Cys Asp Asp Ile Ser Gly Val Cys
530 535 540

Glu Gly Phe Glu Arg Gln Ala Ala Arg Tyr Arg Glu Ile Ile Glu Ala
545 550 555 560

Tyr Phe Asp Val Val Pro Pro His Arg Arg Gly Gly Ile Ser Val Trp
565 570 575

Gly Ile Ala Asp His Tyr Ser Trp Tyr Tyr Thr His Glu Gly Tyr Val
580 585 590

Asp Trp Pro Leu Leu Trp Asp Arg Asn Leu Gln Pro Lys Pro Ala Tyr
595 600 605

Asn Ala Val Tyr Glu Ala Leu Gln Gln Gly Gln
610 615

<210> 353 
<211> 1983 
<212> DNA 
<213> Unknown

<220>

<223> Obtained from an environmental sample.



<400> 353 

atggtttgct cggccgccct cggggcgacc gccgtcagtg cccaaacctt gagcaacaac 60
tcaaccggta ccaacaacgg tttctactac accttttgga aggactcggg cagcgcgacc 120
atgaccctcg cagcgggtgg tcgttacacc tcccaatgga ccaacaacac caataactgg 180
gtgggtggta aaggctggaa cccgggcaat agcacccgcg tgatcagcta ttccggtaac 240
tatggcgtca gcaatagcca aaattcctac ctggccttgt acggctggac ccgcagcccg 300
ctgattgaat actacgtgat cgaaagctac ggctcctaca acccggcatc ctgctctggc 360
ggtaccaata tgggtagctt ccagagtgat ggcgcgacct acgatgtgcg ccgttgccag 420
cgcgtgcaac agccatccat cgatggcacc caaaccttct accaatattt cagcgtgcgc 480
aatccgaaaa aaggcttcgg ccaaatttcc ggcaccatca cctttgctaa ccacgcagcc 540
ttttgggcga gcaagggcat gaacctgggt gcccataact atcaagtaat ggcgaccgag 600
ggttatcaaa gtaccggcag ttccgatatc acagtgagtg aaggccccat caacggcggc 660
accagcagca cgcccccggt aacaaccagc agcagtgcca gctcagtggc aacgggcggt 720
ggtaacaccg gtagtggcgt agtggtgcgc gcgcgcggcg tcgcgggcgg cgagcatatt 780
aacctgcgta tcggtggcaa taccgtcgcc agctggaacc tcaccaccag cttccaggat 840
ctcagctaca gcggcaccgc gtcgggcgat atccaggtgc aatatgacaa cgacggcggc 900
agccgcgatg tcgtagtgga ttacatccgc gttaacggcg aaacccgcca ggcggaggat 960
atgagctaca acaccgcctt gtacgcgaat ggttcctgcg gtggcggcgg caattccgaa 1020
ctgatgcact gcaatggcgt gatcggtttt ggctacacct atgattgctt cagcggcaac 1080
tgctccggcg gcagcacggg gggtggcaac actggcacca gcagcagcgc ggcgagcgcc 1140
ggtggtggca acagcaactg ctccggctat gttggtatta ctttcgacga tggcccaacc 1200
gctaacactc ccaccctggt taacctgttg aagcaaaaca acctgacgcc ggtaacctgg 1260
ttcaaccagg gcaacaacgt ggttgccaat gccaattaca tggcccagca attgagcgtt 1320
ggtgaagtac ataaccacag ctacagccat ccgcaaatgg gcagcatgac ctaccaacag 1380
gtttatgacg agttgaaccg tgccaaccag gctatccaaa ccgcgggtgc acccaagcca 1440
accctgttcc gcccacctta tggcaccgtt aactccacca tccaacaagc ggcccaggcc 1500
ttgggtttgc gggtgatcac ctgggatgtg gattcgcaag actggaatgg cgccactgcc 1560
tcggccatcg ccagcgcagc taaccgcctg acaaacggcc aggtaatcct gatgcacgac 1620
ggcagctaca ccaacaccaa tgccgctatt gcccagatcg cctccagcct gcgcgccaag 1680
ggcttgtgcc cgggtcgtat cgatccggca accggccgag ccgttgctcc tgcgggtggc 1740
aatactggcg gcggtactgt gtcaagcagc actcgcagca gcactccggt agtggtaagc 1800
agcagccgta gcagcagtag tgttgccgct ggcggcgcct gccaatgcaa ttggtggggc 1860
acccgttacc cgatctgtac gtccactgcc agtggttggg gttgggagaa caaccgcagc 1920
tgtatcacca ccagcacctg taacagtcag ggccctggcg gtggtggggt agtgtgtaac 1980
taa 1983



<210> 354 
<211> 660 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(18) 

<400> 354 

Met Val Cys Ser Ala Ala Leu Gly Ala Thr Ala Val Ser Ala Gln Thr
1 5 10 15

Leu Ser Asn Asn Ser Thr Gly Thr Asn Asn Gly Phe Tyr Tyr Thr Phe
20 25 30

Trp Lys Asp Ser Gly Ser Ala Thr Met Thr Leu Ala Ala Gly Gly Arg
35 40 45

Tyr Thr Ser Gln Trp Thr Asn Asn Thr Asn Asn Trp Val Gly Gly Lys
50 55 60

Gly Trp Asn Pro Gly Asn Ser Thr Arg Val Ile Ser Tyr Ser Gly Asn
65 70 75 80

Tyr Gly Val Ser Asn Ser Gln Asn Ser Tyr Leu Ala Leu Tyr Gly Trp
85 90 95

Thr Arg Ser Pro Leu Ile Glu Tyr Tyr Val Ile Glu Ser Tyr Gly Ser
100 105 110

Tyr Asn Pro Ala Ser Cys Ser Gly Gly Thr Asn Met Gly Ser Phe Gln
115 120 125

Ser Asp Gly Ala Thr Tyr Asp Val Arg Arg Cys Gln Arg Val Gln Gln
130 135 140

Pro Ser Ile Asp Gly Thr Gln Thr Phe Tyr Gln Tyr Phe Ser Val Arg
145 150 155 160

Asn Pro Lys Lys Gly Phe Gly Gln Ile Ser Gly Thr Ile Thr Phe Ala
165 170 175

Asn His Ala Ala Phe Trp Ala Ser Lys Gly Met Asn Leu Gly Ala His
180 185 190

Asn Tyr Gln Val Met Ala Thr Glu Gly Tyr Gln Ser Thr Gly Ser Ser
195 200 205

Asp Ile Thr Val Ser Glu Gly Pro Ile Asn Gly Gly Thr Ser Ser Thr
210 215 220

Pro Pro Val Thr Thr Ser Ser Ser Ala Ser Ser Val Ala Thr Gly Gly
225 230 235 240

Gly Asn Thr Gly Ser Gly Val Val Val Arg Ala Arg Gly Val Ala Gly
245 250 255

Gly Glu His Ile Asn Leu Arg Ile Gly Gly Asn Thr Val Ala Ser Trp
260 265 270

Asn Leu Thr Thr Ser Phe Gln Asp Leu Ser Tyr Ser Gly Thr Ala Ser
275 280 285

Gly Asp Ile Gln Val Gln Tyr Asp Asn Asp Gly Gly Ser Arg Asp Val
290 295 300

Val Val Asp Tyr Ile Arg Val Asn Gly Glu Thr Arg Gln Ala Glu Asp
305 310 315 320

Met Ser Tyr Asn Thr Ala Leu Tyr Ala Asn Gly Ser Cys Gly Gly Gly
325 330 335

Gly Asn Ser Glu Leu Met His Cys Asn Gly Val Ile Gly Phe Gly Tyr
340 345 350

Thr Tyr Asp Cys Phe Ser Gly Asn Cys Ser Gly Gly Ser Thr Gly Gly
355 360 365

Gly Asn Thr Gly Thr Ser Ser Ser Ala Ala Ser Ala Gly Gly Gly Asn
370 375 380

Ser Asn Cys Ser Gly Tyr Val Gly Ile Thr Phe Asp Asp Gly Pro Thr
385 390 395 400

Ala Asn Thr Pro Thr Leu Val Asn Leu Leu Lys Gln Asn Asn Leu Thr
405 410 415

Pro Val Thr Trp Phe Asn Gln Gly Asn Asn Val Val Ala Asn Ala Asn
420 425 430

Tyr Met Ala Gln Gln Leu Ser Val Gly Glu Val His Asn His Ser Tyr
435 440 445

Ser His Pro Gln Met Gly Ser Met Thr Tyr Gln Gln Val Tyr Asp Glu
450 455 460

Leu Asn Arg Ala Asn Gln Ala Ile Gln Thr Ala Gly Ala Pro Lys Pro
465 470 475 480

Thr Leu Phe Arg Pro Pro Tyr Gly Thr Val Asn Ser Thr Ile Gln Gln
485 490 495

Ala Ala Gln Ala Leu Gly Leu Arg Val Ile Thr Trp Asp Val Asp Ser
500 505 510

Gln Asp Trp Asn Gly Ala Thr Ala Ser Ala Ile Ala Ser Ala Ala Asn
515 520 525

Arg Leu Thr Asn Gly Gln Val Ile Leu Met His Asp Gly Ser Tyr Thr
530 535 540

Asn Thr Asn Ala Ala Ile Ala Gln Ile Ala Ser Ser Leu Arg Ala Lys
545 550 555 560

Gly Leu Cys Pro Gly Arg Ile Asp Pro Ala Thr Gly Arg Ala Val Ala
565 570 575

Pro Ala Gly Gly Asn Thr Gly Gly Gly Thr Val Ser Ser Ser Thr Arg
580 585 590

Ser Ser Thr Pro Val Val Val Ser Ser Ser Arg Ser Ser Ser Ser Val
595 600 605

Ala Ala Gly Gly Ala Cys Gln Cys Asn Trp Trp Gly Thr Arg Tyr Pro
610 615 620

Ile Cys Thr Ser Thr Ala Ser Gly Trp Gly Trp Glu Asn Asn Arg Ser
625 630 635 640

Cys Ile Thr Thr Ser Thr Cys Asn Ser Gln Gly Pro Gly Gly Gly Gly
645 650 655

Val Val Cys Asn
660

<210> 355 
<211> 1125 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 



<400> 355 

atgaaacgaa ccatcttctt aagactctta gccggcgccc tcctctccgc cgcggccctc 60
gcggccgggg ggtgccggcc ctctagtccc ccgaaggtcg agatcgaggc caatatcccc 120
tccctcaaag aggtctgcgc ttcttatttc gagatcggcg cggccgtcga gccgtatcag 180
ttatcctctc caccccacga tgcccttctg cggaaacatt ttaactgcct cgtggcggag 240
aacgtcatga agcccgcctc catccagcct tcggaggggt atttcaactg gaccgaagca 300
gacaagatcg tgaactacgc caaagcccac gggatgaagc tccgcttcca taccctcgtc 360
tggcataatc aggtcccgga ttggttcttc gcgggtaacg acaaaaccct ccttttgcag 420
cgcttggaga atcatatccg gactatcatt aaaagatatg gcgataaggt cgactattgg 480
gacgtggtaa atgaggctat agacccgagc caaccggatg gcatgaggag gagcaaatgg 540
taccagatca ccgggaagga ctacatcaag accgccttcc gggtggcaga cgacgagctc 600
aggaagaatg ggtggaggaa agaaggtcgt cagctctata tcaacgacta caacacccat 660
gatccgacga agagagagta catctggcgc ttgatcgatg agcttcaaac ggaagggatt 720
cccgtcgacg gagtaggcca ccagacgcat atcaatatcg aatggccgcc cgtaaaccag 780
atcgtggact cgatccgctt cttcggggaa aaaggcctcg ataaccaggt gaccgagctg 840
gatgtgagca tatatacgga tagatccagt tcctacggga gttaccaagc gatcccgcag 900
gaagtcttca tcaagcaggg taatcgctac aaggaactct ttgaagggct aaaaagtgta 960
aaaaactacc tcagcaacgt caccttctgg ggcatggcgg acgatcatac ctggctgaac 1020
cattggccca tcgaacggcc cgatgctcct cttcctttcg atatctatct caaggccaag 1080
ccggcgtatt gggggatcgt ggatgctttg aagctttcgc ggtga 1125

<210> 356
<211> 374
<212> PRT
<213> Unknown

<220>
<223> Obtained from an environmental sample.
<221> SIGNAL
<222> (1)...(21)

<400> 356

Met Lys Arg Thr Ile Phe Leu Arg Leu Leu Ala Gly Ala Leu Leu Ser
1 5 10 15

Ala Ala Ala Leu Ala Ala Gly Gly Cys Arg Pro Ser Ser Pro Pro Lys
20 25 30

Val Glu Ile Glu Ala Asn Ile Pro Ser Leu Lys Glu Val Cys Ala Ser
35 40 45

Tyr Phe Glu Ile Gly Ala Ala Val Glu Pro Tyr Gln Leu Ser Ser Pro
50 55 60

Pro His Asp Ala Leu Leu Arg Lys His Phe Asn Cys Leu Val Ala Glu
65 70 75 80

Asn Val Met Lys Pro Ala Ser Ile Gln Pro Ser Glu Gly Tyr Phe Asn
85 90 95

Trp Thr Glu Ala Asp Lys Ile Val Asn Tyr Ala Lys Ala His Gly Met
100 105 110

Lys Leu Arg Phe His Thr Leu Val Trp His Asn Gln Val Pro Asp Trp
115 120 125

Phe Phe Ala Gly Asn Asp Lys Thr Leu Leu Leu Gln Arg Leu Glu Asn

130 135 140 

His Ile Arg Thr Ile Ile Lys Arg Tyr Gly Asp Lys Val Asp Tyr Trp
145 150 155 160

Asp Val Val Asn Glu Ala Ile Asp Pro Ser Gln Pro Asp Gly Met Arg
165 170 175

Arg Ser Lys Trp Tyr Gln Ile Thr Gly Lys Asp Tyr Ile Lys Thr Ala
180 185 190

Phe Arg Val Ala Asp Asp Glu Leu Arg Lys Asn Gly Trp Arg Lys Glu
195 200 205

Gly Arg Gln Leu Tyr Ile Asn Asp Tyr Asn Thr His Asp Pro Thr Lys
210 215 220

Arg Glu Tyr Ile Trp Arg Leu Ile Asp Glu Leu Gln Thr Glu Gly Ile
225 230 235 240

Pro Val Asp Gly Val Gly His Gln Thr His Ile Asn Ile Glu Trp Pro
245 250 255

Pro Val Asn Gln Ile Val Asp Ser Ile Arg Phe Phe Gly Glu Lys Gly
260 265 270

Leu Asp Asn Gln Val Thr Glu Leu Asp Val Ser Ile Tyr Thr Asp Arg
275 280 285

Ser Ser Ser Tyr Gly Ser Tyr Gln Ala Ile Pro Gln Glu Val Phe Ile
290 295 300

Lys Gln Gly Asn Arg Tyr Lys Glu Leu Phe Glu Gly Leu Lys Ser Val
305 310 315 320

Lys Asn Tyr Leu Ser Asn Val Thr Phe Trp Gly Met Ala Asp Asp His
325 330 335

Thr Trp Leu Asn His Trp Pro Ile Glu Arg Pro Asp Ala Pro Leu Pro
340 345 350

Phe Asp Ile Tyr Leu Lys Ala Lys Pro Ala Tyr Trp Gly Ile Val Asp
355 360 365

Ala Leu Lys Leu Ser Arg
370

<210> 357 
<211> 1155 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample.
<400> 357 

atgaataact tcagaaatac atttctaatc gtcgttgtac tggcggtcgt cgtcggcgtg 60 

ctgccggcct gcgaagccgg tccgccggaa aatacaagtt cgtccctgca ggaggcatat 120 

gcagatgtgt ttctgatcgg caccgcgctc aatctggcac agatcgacgg aagggatgaa 180 

caaggcgtac gtctggtgga gcggcatttt aatgcgatta caccagagaa cattacaaaa 240 

tggggaccga tacatccggc gccgggcgaa tataatttcg gaccggccga ccggtttgtt 300 

gaattcggtg aagcccacga catgttcatg ataggccata cgcttgtatg gcacagccag 360 

acgcccggat gggtattcga ggatgaagcc ggaaatccgc tcggccgcga cgagctcatc 420 

gaacgcatgc gcgatcatat ccataccgtc gtcggacggt accggggtag aatacacgca 480 

tgggacgtcg tcaacgaagc gttgaatgaa gacggaaccc tgcgggaatc cccctggtac 540 

cgtatcatcg gcgaggatta cctgttgaaa gcgttcgagt tcgcgcatga agcggacccg 600 

gatgccgagc tgtactataa cgattattct ctcgaaaatc ccgccaagcg ggcgggggcg 660 

gtacgcctgg tccggtacct gcaggagaac ggggcgccga tacacgggat cggtacccag 720 

ggacactact ctcttgactg gccatcgctc gacgagatcg aaagaaccat caccgatttc 780 

gccgcgttgg acgtggacgt catggttacc gaacttgaaa tcgacgtcct cccttccgcg 840 

ttcgagtatc agggggccga tattgcgatg cgggcggaac tcgaagagcg gttgaatccg 900 

tatcccgacg aactgccggc cgaggtcgat gaagcgctta cacagcggta tcgggacatc 960 

ttcgaggtat ttctgcggca cagcgacgtt cttacgcgcg taacgttctg gggggtgacc 1020 

gatggagatt cgtggaagaa taactggccg gtaccgggaa ggacgaatta tccgctgctg 1080 

ttcgaccgcg aatggcagcc aaaaccagca ttttattccg tgatcgaagt tgcggatgag 1140 

atgctgaatg aataa 1155 

<210> 358 
<211> 384 
<212> PRT 
<213> Unknown 

<220>
<223> Obtained from an environmental sample.

<221> SIGNAL
<222> (1)...(25)

<400> 358

Met Asn Asn Phe Arg Asn Thr Phe Leu Ile Val Val Val Leu Ala Val
1 5 10 15

Val Val Gly Val Leu Pro Ala Cys Glu Ala Gly Pro Pro Glu Asn Thr
20 25 30

Ser Ser Ser Leu Gln Glu Ala Tyr Ala Asp Val Phe Leu Ile Gly Thr
35 40 45

Ala Leu Asn Leu Ala Gln Ile Asp Gly Arg Asp Glu Gln Gly Val Arg
50 55 60

Leu Val Glu Arg His Phe Asn Ala Ile Thr Pro Glu Asn Ile Thr Lys
65 70 75 80

Trp Gly Pro Ile His Pro Ala Pro Gly Glu Tyr Asn Phe Gly Pro Ala
85 90 95

Asp Arg Phe Val Glu Phe Gly Glu Ala His Asp Met Phe Met Ile Gly
100 105 110

His Thr Leu Val Trp His Ser Gln Thr Pro Gly Trp Val Phe Glu Asp
115 120 125

Glu Ala Gly Asn Pro Leu Gly Arg Asp Glu Leu Ile Glu Arg Met Arg
130 135 140

Asp His Ile His Thr Val Val Gly Arg Tyr Arg Gly Arg Ile His Ala
145 150 155 160

Trp Asp Val Val Asn Glu Ala Leu Asn Glu Asp Gly Thr Leu Arg Glu
165 170 175

Ser Pro Trp Tyr Arg Ile Ile Gly Glu Asp Tyr Leu Leu Lys Ala Phe
180 185 190

Glu Phe Ala His Glu Ala Asp Pro Asp Ala Glu Leu Tyr Tyr Asn Asp
195 200 205

Tyr Ser Leu Glu Asn Pro Ala Lys Arg Ala Gly Ala Val Arg Leu Val
210 215 220

Arg Tyr Leu Gln Glu Asn Gly Ala Pro Ile His Gly Ile Gly Thr Gln
225 230 235 240

Gly His Tyr Ser Leu Asp Trp Pro Ser Leu Asp Glu Ile Glu Arg Thr
245 250 255

Ile Thr Asp Phe Ala Ala Leu Asp Val Asp Val Met Val Thr Glu Leu
260 265 270

Glu Ile Asp Val Leu Pro Ser Ala Phe Glu Tyr Gln Gly Ala Asp Ile
275 280 285

Ala Met Arg Ala Glu Leu Glu Glu Arg Leu Asn Pro Tyr Pro Asp Glu
290 295 300

Leu Pro Ala Glu Val Asp Glu Ala Leu Thr Gln Arg Tyr Arg Asp Ile
305 310 315 320

Phe Glu Val Phe Leu Arg His Ser Asp Val Leu Thr Arg Val Thr Phe
325 330 335

Trp Gly Val Thr Asp Gly Asp Ser Trp Lys Asn Asn Trp Pro Val Pro
340 345 350

Gly Arg Thr Asn Tyr Pro Leu Leu Phe Asp Arg Glu Trp Gln Pro Lys
355 360 365

Pro Ala Phe Tyr Ser Val Ile Glu Val Ala Asp Glu Met Leu Asn Glu
370 375 380

<210> 359 
<211> 2724 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 359 

atgacccggt ctgtacgacc aagagcatgg ggggccggcc tactggccct cgcgatggtg 60 
gcgacagtcg cccccacggc caccggccac agccacgaca cggcagagcc cgtcgtcgtg 120 
gtctacaccg acttcgagaa cgacagcatc gagccgtggg cgcagtccgg cggcccgacg 180 
ctgaacatcg tcgaggtcga cggcgggcac gcgctgcgcg tcggcaacca ccagaacacc 240 
tgggacggca tccagaccca gcccgccacc acgcggatcg agccgggtgt cgagcacacc 300
ctgtcgatgc gcgtccggct cgtgggcgac ggcacggcga cgacgccggc ccggtggatc 360
ggccgcgacc ccggagccga gaacggctac cagtggatcg gtaacacgac gatctcgacc 420
gagagctgga cgaccatccg gggaacgtgg ctccctcggg cggacgcgaa cgcctcggag 480
ctctatgtct accccgaggt cacaccggtg gccggcttcg actacctcct cgatgacctg 540
ctcatcgagc gtgctgcccc tgtcgacggc ggcgccccgg gcaccgtcgt ctacaccgct 600
ggattcgaga cggacctgga cggctgggag gcacgcgccg acggcgtcgg tgtcggccag 660
ctcgaccgga ccgacgcgga gtcggccgag ggcgactggt ccgcgatcgt gaccgaccgc 720
acctcgcacg gccacggcct gcgcctggac gtcacggaca tcatggacgc gggcgtcacg 780
tacgagatca gcgcccaggt gaagttcgcc gggaccggtg gtccgggcaa catctggctg 840
agccaggagc tggtcgtgga cgggggtagc acctacggca ccgtcctcca ggtccctggc 900
gtcacctcga cagcctggac gcagatcacc acgaactacg tcacgccgac ggccgaccag 960
ctgttcctct acttcgagac gaactggccg gacggcatcg aggacgactt cctcctcgac 1020
gacgtccgca tccgtgtcgc ccctcgggcg atcatccagg aggacctcac tccgctgatg 1080
gacacgctgg acgtgcccat gggtgtcgcg atcgaccagc gtgagacctc cggcagcctc 1140
gcggacctcc tgctgctgca cttcgaccag gtcacggccg agaaccacat gaagccggag 1200
gcgtggtacg acgcggcggg caacttccgc atccacccgc aggcccgcgc catcatggac 1260
ttcgcggcgg agaacgacct gcgcgtcttc ggtcacgtcc tggtgtggca cggccagacc 1320
ccggacttct tcttcacgca cgcggacggc accccgctga cctcgagcga ggccgaccag 1380
gcgatcctgc gcgaccgcat gcgcacgcac atcttcaacg tcgccgaggc cctctccgag 1440
tggggcgagt acggcggcga caacccgctc gtggcctggg acgtcgtcaa cgaggtcgtc 1500
tccgacagcg gcgagcacag cgacggcctg cgccgtagcc gctggtacga cgtgctgggc 1560
gaggagttca tcgacctggc gttcatctac gcgaaccagg cgttcaacgg tgagttcgcc 1620
gctgacgacg ccaaccaccc ggtcacgctc ttcatcaacg actacaacac cgagcagtcc 1680
ggcaagcaga accggtacgc cgcgctcatc gaccgcctca tcgagcgcga ggtcccgatc 1740
gacgccgtgg ggcaccagtt ccacgtcagc ctggccatgc ccatcgcgaa cctgcgcggc 1800
gccctcgagc gcttccagga caccgggctg atccagggcg tcaccgagct cgacgtcacc 1860
gtcggcaaca acccgaccga ggcgctgctc gtcgagcagg gctactacta ccgggacgcc 1920
ttccggctgt tccgtgagtt cacggaggac ctctactcgg tcaccgtgtg gggtctcacc 1980
gacgaccgca gctggcgcag cgctcaggcg ccgctgctgt tcgacgcggg cctgcaggcc 2040
aagccggcct actacggcgc catcgacgcc gacctggacg cacgcgtgcg tgcggcctac 2100
gtgttcgccg aggacatcgc cctcgacgag gccgcgctga cgagccccac ctgggaccgt 2160
ctgccgctgc accagatcga cggggccggc gagttccagc tccgctgggc ggccgaccac 2220
ctcacggtgt tcgtccacgt caccgacggt gacgaggtcg agatcgtgct cggcgacgag 2280
acctacacgg tctcgtcgga cggcgagggc gacctggacg cggtcaccgc ggccggggag 2340
aacggctcct ggaccgctgt ggtccgcgtg ccgctcacgg ccgagcaggg cgacaccgcc 2400
cagttcgacc tccggatcat cgacggcgcc accacctccg ggtggaacgt cgaaggtgtc 2460
ctgggcaccc tgaccctggt cgaggagctg tccttcgtcg aggtcgtcga ggcggccgac 2520
cggccgacca tcgacggcga gatcgacgcc gtgtgggagg acgccaacgt cgtcaccacg 2580
gacgtccgta tcgagggcgc tgctgacggc gcgaaggccg agatccggac cctgtgggac 2640
aacaacacgc tgttcgtcct cgcggagatc gccgacccgg tgatcgacgt gacggcctcc 2700
agcccgtggg agcaggactc gctc 2724

<210> 360
<211> 908
<212> PRT
<213> Unknown

<220>
<223> Obtained from an environmental sample.
<221> SIGNAL
<222> (1)...(31)

<400> 360

Met Thr Arg Ser Val Arg Pro Arg Ala Trp Gly Ala Gly Leu Leu Ala
1 5 10 15

Leu Ala Met Val Ala Thr Val Ala Pro Thr Ala Thr Gly His Ser His
20 25 30

Asp Thr Ala Glu Pro Val Val Val Val Tyr Thr Asp Phe Glu Asn Asp
35 40 45

Ser Ile Glu Pro Trp Ala Gln Ser Gly Gly Pro Thr Leu Asn Ile Val
50 55 60

Glu Val Asp Gly Gly His Ala Leu Arg Val Gly Asn His Gln Asn Thr
65 70 75 80

Trp Asp Gly Ile Gln Thr Gln Pro Ala Thr Thr Arg Ile Glu Pro Gly
85 90 95

Val Glu His Thr Leu Ser Met Arg Val Arg Leu Val Gly Asp Gly Thr
100 105 110

Ala Thr Thr Pro Ala Arg Trp Ile Gly Arg Asp Pro Gly Ala Glu Asn
115 120 125

Gly Tyr Gln Trp Ile Gly Asn Thr Thr Ile Ser Thr Glu Ser Trp Thr
130 135 140

Thr Ile Arg Gly Thr Trp Leu Pro Arg Ala Asp Ala Asn Ala Ser Glu
145 150 155 160

Leu Tyr Val Tyr Pro Glu Val Thr Pro Val Ala Gly Phe Asp Tyr Leu
165 170 175

Leu Asp Asp Leu Leu Ile Glu Arg Ala Ala Pro Val Asp Gly Gly Ala
180 185 190

Pro Gly Thr Val Val Tyr Thr Ala Gly Phe Glu Thr Asp Leu Asp Gly
195 200 205

Trp Glu Ala Arg Ala Asp Gly Val Gly Val Gly Gln Leu Asp Arg Thr
210 215 220

Asp Ala Glu Ser Ala Glu Gly Asp Trp Ser Ala Ile Val Thr Asp Arg
225 230 235 240

Thr Ser His Gly His Gly Leu Arg Leu Asp Val Thr Asp Ile Met Asp
245 250 255

Ala Gly Val Thr Tyr Glu Ile Ser Ala Gln Val Lys Phe Ala Gly Thr
260 265 270

Gly Gly Pro Gly Asn Ile Trp Leu Ser Gln Glu Leu Val Val Asp Gly
275 280 285

Gly Ser Thr Tyr Gly Thr Val Leu Gln Val Pro Gly Val Thr Ser Thr
290 295 300

Ala Trp Thr Gln Ile Thr Thr Asn Tyr Val Thr Pro Thr Ala Asp Gln
305 310 315 320

Leu Phe Leu Tyr Phe Glu Thr Asn Trp Pro Asp Gly Ile Glu Asp Asp
325 330 335

Phe Leu Leu Asp Asp Val Arg Ile Arg Val Ala Pro Arg Ala Ile Ile
340 345 350

Gln Glu Asp Leu Thr Pro Leu Met Asp Thr Leu Asp Val Pro Met Gly
355 360 365

Val Ala Ile Asp Gln Arg Glu Thr Ser Gly Ser Leu Ala Asp Leu Leu
370 375 380

Leu Leu His Phe Asp Gln Val Thr Ala Glu Asn His Met Lys Pro Glu
385 390 395 400

Ala Trp Tyr Asp Ala Ala Gly Asn Phe Arg Ile His Pro Gln Ala Arg
405 410 415

Ala Ile Met Asp Phe Ala Ala Glu Asn Asp Leu Arg Val Phe Gly His
420 425 430

Val Leu Val Trp His Gly Gln Thr Pro Asp Phe Phe Phe Thr His Ala
435 440 445

Asp Gly Thr Pro Leu Thr Ser Ser Glu Ala Asp Gln Ala Ile Leu Arg
450 455 460

Asp Arg Met Arg Thr His Ile Phe Asn Val Ala Glu Ala Leu Ser Glu
465 470 475 480

Trp Gly Glu Tyr Gly Gly Asp Asn Pro Leu Val Ala Trp Asp Val Val
485 490 495

Asn Glu Val Val Ser Asp Ser Gly Glu His Ser Asp Gly Leu Arg Arg
500 505 510

Ser Arg Trp Tyr Asp Val Leu Gly Glu Glu Phe Ile Asp Leu Ala Phe
515 520 525

Ile Tyr Ala Asn Gln Ala Phe Asn Gly Glu Phe Ala Ala Asp Asp Ala
530 535 540

Asn His Pro Val Thr Leu Phe Ile Asn Asp Tyr Asn Thr Glu Gln Ser
545 550 555 560

Gly Lys Gln Asn Arg Tyr Ala Ala Leu Ile Asp Arg Leu Ile Glu Arg
565 570 575

Glu Val Pro Ile Asp Ala Val Gly His Gln Phe His Val Ser Leu Ala
580 585 590

Met Pro Ile Ala Asn Leu Arg Gly Ala Leu Glu Arg Phe Gln Asp Thr
595 600 605

Gly Leu Ile Gln Gly Val Thr Glu Leu Asp Val Thr Val Gly Asn Asn
610 615 620

Pro Thr Glu Ala Leu Leu Val Glu Gln Gly Tyr Tyr Tyr Arg Asp Ala
625 630 635 640

Phe Arg Leu Phe Arg Glu Phe Thr Glu Asp Leu Tyr Ser Val Thr Val
645 650 655

Trp Gly Leu Thr Asp Asp Arg Ser Trp Arg Ser Ala Gln Ala Pro Leu
660 665 670

Leu Phe Asp Ala Gly Leu Gln Ala Lys Pro Ala Tyr Tyr Gly Ala Ile
675 680 685

Asp Ala Asp Leu Asp Ala Arg Val Arg Ala Ala Tyr Val Phe Ala Glu
690 695 700

Asp Ile Ala Leu Asp Glu Ala Ala Leu Thr Ser Pro Thr Trp Asp Arg
705 710 715 720

Leu Pro Leu His Gln Ile Asp Gly Ala Gly Glu Phe Gln Leu Arg Trp
725 730 735

Ala Ala Asp His Leu Thr Val Phe Val His Val Thr Asp Gly Asp Glu
740 745 750

Val Glu Ile Val Leu Gly Asp Glu Thr Tyr Thr Val Ser Ser Asp Gly
755 760 765

Glu Gly Asp Leu Asp Ala Val Thr Ala Ala Gly Glu Asn Gly Ser Trp
770 775 780

Thr Ala Val Val Arg Val Pro Leu Thr Ala Glu Gln Gly Asp Thr Ala
785 790 795 800

Gln Phe Asp Leu Arg Ile Ile Asp Gly Ala Thr Thr Ser Gly Trp Asn
805 810 815

Val Glu Gly Val Leu Gly Thr Leu Thr Leu Val Glu Glu Leu Ser Phe
820 825 830

Val Glu Val Val Glu Ala Ala Asp Arg Pro Thr Ile Asp Gly Glu Ile
835 840 845

Asp Ala Val Trp Glu Asp Ala Asn Val Val Thr Thr Asp Val Arg Ile
850 855 860

Glu Gly Ala Ala Asp Gly Ala Lys Ala Glu Ile Arg Thr Leu Trp Asp
865 870 875 880

Asn Asn Thr Leu Phe Val Leu Ala Glu Ile Ala Asp Pro Val Ile Asp
885 890 895

Val Thr Ala Ser Ser Pro Trp Glu Gln Asp Ser Leu
900 905

<210> 361
<211> 5040
<212> DNA
<213> Unknown

<220>
<223> Obtained from an environmental sample.



<400> 361 

atggcaagaa gtaagcgagt attagcatgg attatgtcta gtgtgcttct gatatccatg 60
gcgatgccat ccttcgcatc aggtgattca agccaagtgc caagggttat atttgaaaca 120
ggttttgaaa cgggattaga tggcttcaaa ggacggggta gtgccacctt aactcgaacg 180
actgatgaaa cgcaagcagg cgactattcg gttcttgtga gcaatcggct tgagcactgg 240
aatggggcat cattgccact tacaggcttc gttctaccag gtaatacata tgaatttgtt 300
ggttacataa aagcaaaagc agatgtagca gacaattatg tcatgagtgg tgagtacaat 360
gaggggattt ctggaaatca atatccatgg atatctaatc gtttgttaac ggttcaagat 420
ggctttgttg agtttagagg tgaactaacc atactagagg atatgacgtc ctttaatcta 480
aactttgaac atcaaaatgc tgaagtggaa ttttatttag attctgttca ggttattcta 540
atcgaagaag gtcaagtcaa tgacttacca atgaatgtaa gaagagcgcc acttacactt 600
gctgaaactc ccttacatga gatttgggca gatcacttta ctattggcaa tatttatacg 660
ccaggttttc gcacagatat acgtggtgag gtattagccc atcattttaa tgtgatcaca 720
gctgaaaata ttatgaagcc agatcatttg caaagggaac aaggtatttt tacttttagt 780
gcttccaacg atatgatgga atttgccaga gcaaataatc aagaagtcat tggacatact 840
ttggtgtggc attctcaatc cttcccatgg tttgaagctt taaatccaac acgtgatgaa 900
gctatagcca ttatgcatgc ccatattgaa actgttatgg gacattttaa tgaaaactac 960
ccaggtgtca ttacaggatg ggatgttttg aatgaagcca ttcaaccaag acagggtcaa 1020
gatcctgaaa attggcgctt gcatttaagg gataccaaat ggttacgtgc cattggtgat 1080
gattatattg ccatcgcttt taacaaagcc catgaaatgg atccagatgc tattctttat 1140
tataatgatt ataatgataa tgactatttt aaagcaacca ttataaaagc catggtgcag 1200
gagttgcgta atgaaggcgt gcccattcat cgtattggga tgcaaggtca ttataattta 1260
cagacaccat taaactctat tagaaccagc gttgagcgtt ttagtgaaat tactggtcat 1320
gaagatctac cacctattgg cattagtttc acagaaattg atgtaacggt accagggttt 1380
gaaagtgcag cccgtttacc tgaagaggta gaaattcgcc aagctcagtt ttatgctcaa 1440
ttaatgcaga ttttaagaga caacagcgat gtgattcatc gtgttacttt ctggggtatg 1500
tctgatcgtg aatcatggcg tgcagatcgt catcctaaca tgttagatcc tcagtatggt 1560
ccaaagcatg tctttcatgc tatagccaat ccagaggctt tccttacggc ctacccatta 1620
ccagagacgc cagatgctca aacagcctat gcatctcaag gtcaaccagt tgtggggcag 1680
tttaatttgg atgcgtatca aaattcagaa gtaataccag tggctaatca aatgacagcc 1740
cataatggcg caacagcggt tgcaagggtg gtttggcacg aagatgctat ttatatttta 1800

gccaatgtca gcgatgccac accgaatgta gcagcttcgg ctgcccatga gcaagactca 1860 

cttgaggtat ttatttcaaa tacggattca agaatttcta attatatgcc aggtgactat 1920 

caactgagat ttaatcgtgc cggcgtgcat acatatggtt cgactggttc gattgaaggt 1980 

atgacctttg cggtacaaga tggtccaata ggttatcaag ttgaagtgcg tattccctta 2040 

gaaaatgaag tctatgttgg cagaagactt ggttttgact tacaagtcaa tgatgcatgg 2100 

gaagttggcg gtacttctgg ccgacaagca tttgctaaat ggaatgatca cactgacaat 2160 

ggctggcagt ctacagagtt ttggggctgg ttattattac aaggcgatgc ggcacctgtc 2220 

ttacccgttg tattagtgga agaaggcttt gaaacggatt taggttcatt ccaaccaagg 2280 

ggtagtagta cactgactcg aacccaagag gttagtcatg aaggcgacta ttccgtattg 2340 

gttagcaacc gtgtcaacaa ctggaacggt gcgtcattac cgttaacagg cattgttcaa 2400 

ccaggcaaca cttatgagtt tgttggttac attagagcaa aagcagatgt aactggatca 2460 

tatatcatga gtggtgagtt taataatgga tctggtgtat tagaaaacgg tagtattaat 2520 

cgatggccat ggctatcaaa ccgttcatta acgatagcag atggttttgt tgagtttaag 2580 

tcagaactaa ccatacctag tgacatgacg acgtttaact tgaactttga acaccaaaat 2640 

gctgaagtag aattttactt agatgctgtt caagtcactt tgattgcaga agctgatgta 2700 

acaccagtgg acccaccagt agacccacca gtagagccag aaattacagt ggtctattca 2760 

atggtagatg atgcagccat tcaagggatt gaagtgggaa caacaggcac tgctgaagat 2820 

ttttcggata ttagtgaagc tttattagta tctggttcac cagttgttac tgctgtagca 2880 

catccagaag aagcaggaaa gatcggtata gagcttagta atcgagcaga gaattggcat 2940 

gcgctagact ttatgttccc agccataggt gtgcagcggg gtgggagcta tcgatttgtg 3000 

gccagtggcc gtatggcaga aggaacaggt aattcaaatc gtagaatgca gtggaatcaa 3060 

acggatgcgc catggagtga aatatcaggt tctagaacca atgtggcacc tgcagcaacc 3120 

acatggacca ttgacgtgac tttaagtcga ttacagatca acacattatt aaacgctggt 3180 

caaagaggtc ttcgaattca aacgggtaat gcaccaactg tgaccattac cattgacgat 3240 

gtgtttgttt atcagattgg tgacattgac acagcaggtt taccattacc accacaatgg 3300 

aattttgatt tgccaagatt atcagaatta ttcgagccat attttggtct tggtaacatt 3360 

tattcaaccg aaacattaat gaacgctaat gaaacaaaaa gagcattttt acatcacttt 3420 

aacgtgatta cagcagaaaa tggtcataaa ccatccagta ttgcagggcc agaaaatagc 3480 

tttacagtac cagaacctga gcaattcaac tttacggatg cagaccgaat tgtaaacttt 3540 

gctgttgaaa atgacattga attagtagga catgcacttg tatggcattc acaaagtcca 3600 

aactggctgt ttagaagtgc ggctaacaca ccgctaacaa gagcagaagc caaagagcgc 3660 

atggcatatt acatgaaaac tgtttcagag catttcgaag cacaaggtac attaggcgca 3720 

ttttatggtt gggatgttgt gaatgaagcc atcgccagtg gtggtggtac attcgtagat 3780 

caaccaggtc attggcgcac gcaaatgcga acatcatcac catggttcca agcatttaac 3840 

aatggattag atgtagaagc cggtgaacat gccagtgatt atattttcta tgcatactat 3900 

tatgcaagaa agtatttccc aacatcgatc ctatactaca acgattacaa cgatgaaata 3960 

ccaaacaagc gagacaatat cgctcaaatg gtagaagaga taaatgcact ttgggaagca 4020 

catgaagaat atgatggtcg cttactgatt gaatccatcg gtatgcaaag tcattatcac 4080 

atggaaggtt ggacaaccag cgtagacaat gtaagagctg ctttagatcg atacattgca 4140 

acaggtgcaa gagtcagtgt gactgagtta gatatcactt atggtggtca tggtagtaat 4200 

gcatatgcat cacttacacc agaacaatta gcggcacaag cggagcgata tgcagagata 4260 

tttacattgt atttagagcg tgcagatcag ttaagccgtg tatccatctg gggtatgtct 4320 

gatgctaaca gctggagaag ttctggattc ccattactat ttgacagttc acttaatgct 4380 

aaaccagcat ttaatgccat tgtagaatta gttaaaaact gggagacacc aacagttgta 4440 

gcaccagtga ttcaaacaag aacactagca ccattagaaa gtggtgaaag agtctttacc 4500 

atgttagatg tggtaagagg atctaatgca cctgtatggt ttagcataac agacggtgca 4560 

ttaccagaag gtataatcct tcattctaga acaggtattt tagaaggaac accagttgaa 4620 

gatggtcact atagctttac tgtaactgct agaaattacg gcggttcaac aagtcaagcg 4680 

ctgactttaa cagtaggtca tccagtagca ccaccagtaa cgccaccagt aacgccacca 4740 

accgtaatca ttgatgaatc ggatatacca caggctggtc caggccttag ggcaccacag 4800 

attgttgtaa ccgttcaaga aggcagtgaa gtaacgtttg atcttgaaaa attagaagaa 4860 

gttatggcat cactttcaag tcaagtgcca ttggtgttag atgttgaatt ggaagattct 4920 

atcatcacct tggatcaaac attacttaaa cgattaacag acaaggcggc tggaatcgaa 4980 

atacaagcag atggatttag ttatatgctt ccagcagagg tattagaggc aattctttgg 5040 



<210> 362 
<211> 1680 
<212> PRT 
<213> Unknown 

<220> 

<223> obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(26)
<400> 362

Met Ala Arg Ser Lys Arg Val Leu Ala Trp Ile Met Ser Ser Val Leu
1 5 10 15

Leu Ile Ser Met Ala Met Pro Ser Phe Ala Ser Gly Asp Ser Ser Gln
20 25 30

Val Pro Arg Val Ile Phe Glu Thr Gly Phe Glu Thr Gly Leu Asp Gly
35 40 45

Phe Lys Gly Arg Gly Ser Ala Thr Leu Thr Arg Thr Thr Asp Glu Thr
50 55 60

Gln Ala Gly Asp Tyr Ser Val Leu Val Ser Asn Arg Leu Glu His Trp
65 70 75 80

Asn Gly Ala Ser Leu Pro Leu Thr Gly Phe Val Leu Pro Gly Asn Thr
85 90 95

Tyr Glu Phe Val Gly Tyr Ile Lys Ala Lys Ala Asp Val Ala Asp Asn
100 105 110

Tyr Val Met Ser Gly Glu Tyr Asn Glu Gly Ile Ser Gly Asn Gln Tyr
115 120 125

Pro Trp Ile Ser Asn Arg Leu Leu Thr Val Gln Asp Gly Phe Val Glu
130 135 140

Phe Arg Gly Glu Leu Thr Ile Leu Glu Asp Met Thr Ser Phe Asn Leu
145 150 155 160

Asn Phe Glu His Gln Asn Ala Glu Val Glu Phe Tyr Leu Asp Ser Val
165 170 175

Gln Val Ile Leu Ile Glu Glu Gly Gln Val Asn Asp Leu Pro Met Asn
180 185 190

Val Arg Arg Ala Pro Leu Thr Leu Ala Glu Thr Pro Leu His Glu Ile
195 200 205

Trp Ala Asp His Phe Thr Ile Gly Asn Ile Tyr Thr Pro Gly Phe Arg
210 215 220

Thr Asp Ile Arg Gly Glu Val Leu Ala His His Phe Asn Val Ile Thr
225 230 235 240

Ala Glu Asn Ile Met Lys Pro Asp His Leu Gln Arg Glu Gln Gly Ile
245 250 255

Phe Thr Phe Ser Ala Ser Asn Asp Met Met Glu Phe Ala Arg Ala Asn
260 265 270

Asn Gln Glu Val Ile Gly His Thr Leu Val Trp His Ser Gln Ser Phe
275 280 285

Pro Trp Phe Glu Ala Leu Asn Pro Thr Arg Asp Glu Ala Ile Ala Ile
290 295 300

Met His Ala His Ile Glu Thr Val Met Gly His Phe Asn Glu Asn Tyr
305 310 315 320

Pro Gly Val Ile Thr Gly Trp Asp Val Leu Asn Glu Ala Ile Gln Pro
325 330 335

Arg Gln Gly Gln Asp Pro Glu Asn Trp Arg Leu His Leu Arg Asp Thr
340 345 350

Lys Trp Leu Arg Ala Ile Gly Asp Asp Tyr Ile Ala Ile Ala Phe Asn
355 360 365

Lys Ala His Glu Met Asp Pro Asp Ala Ile Leu Tyr Tyr Asn Asp Tyr
370 375 380

Asn Asp Asn Asp Tyr Phe Lys Ala Thr Ile Ile Lys Ala Met Val Gln
385 390 395 400

Glu Leu Arg Asn Glu Gly Val Pro Ile His Arg Ile Gly Met Gln Gly
405 410 415

His Tyr Asn Leu Gln Thr Pro Leu Asn Ser Ile Arg Thr Ser Val Glu
420 425 430

Arg Phe Ser Glu Ile Thr Gly His Glu Asp Leu Pro Pro Ile Gly Ile
435 440 445

Ser Phe Thr Glu Ile Asp Val Thr Val Pro Gly Phe Glu Ser Ala Ala
450 455 460

Arg Leu Pro Glu Glu Val Glu Ile Arg Gln Ala Gln Phe Tyr Ala Gln
465 470 475 480

Leu Met Gln Ile Leu Arg Asp Asn Ser Asp Val Ile His Arg Val Thr
485 490 495

Phe Trp Gly Met Ser Asp Arg Glu Ser Trp Arg Ala Asp Arg His Pro
500 505 510

Asn Met Leu Asp Pro Gln Tyr Gly Pro Lys His Val Phe His Ala Ile
515 520 525

Ala Asn Pro Glu Ala Phe Leu Thr Ala Tyr Pro Leu Pro Glu Thr Pro
530 535 540

Asp Ala Gln Thr Ala Tyr Ala Ser Gln Gly Gln Pro Val Val Gly Gln
545 550 555 560

Phe Asn Leu Asp Ala Tyr Gln Asn Ser Glu Val Ile Pro Val Ala Asn
565 570 575

Gln Met Thr Ala His Asn Gly Ala Thr Ala Val Ala Arg Val Val Trp
580 585 590

His Glu Asp Ala Ile Tyr Ile Leu Ala Asn Val Ser Asp Ala Thr Pro
595 600 605

Asn Val Ala Ala Ser Ala Ala His Glu Gln Asp Ser Leu Glu Val Phe
610 615 620

Ile Ser Asn Thr Asp Ser Arg Ile Ser Asn Tyr Met Pro Gly Asp Tyr
625 630 635 640

Gln Leu Arg Phe Asn Arg Ala Gly Val His Thr Tyr Gly Ser Thr Gly
645 650 655

Ser Ile Glu Gly Met Thr Phe Ala Val Gln Asp Gly Pro Ile Gly Tyr
660 665 670

Gln Val Glu Val Arg Ile Pro Leu Glu Asn Glu Val Tyr Val Gly Arg
675 680 685

Arg Leu Gly Phe Asp Leu Gln Val Asn Asp Ala Trp Glu Val Gly Gly
690 695 700

Thr Ser Gly Arg Gln Ala Phe Ala Lys Trp Asn Asp His Thr Asp Asn
705 710 715 720

Gly Trp Gln Ser Thr Glu Phe Trp Gly Trp Leu Leu Leu Gln Gly Asp
725 730 735

Ala Ala Pro Val Leu Pro Val Val Leu Val Glu Glu Gly Phe Glu Thr
740 745 750

Asp Leu Gly Ser Phe Gln Pro Arg Gly Ser Ser Thr Leu Thr Arg Thr
755 760 765

Gln Glu Val Ser His Glu Gly Asp Tyr Ser Val Leu Val Ser Asn Arg
770 775 780

Val Asn Asn Trp Asn Gly Ala Ser Leu Pro Leu Thr Gly Ile Val Gln
785 790 795 800

Pro Gly Asn Thr Tyr Glu Phe Val Gly Tyr Ile Arg Ala Lys Ala Asp
805 810 815

Val Thr Gly Ser Tyr Ile Met Ser Gly Glu Phe Asn Asn Gly Ser Gly
820 825 830

Val Leu Glu Asn Gly Ser Ile Asn Arg Trp Pro Trp Leu Ser Asn Arg
835 840 845

Ser Leu Thr Ile Ala Asp Gly Phe Val Glu Phe Lys Ser Glu Leu Thr
850 855 860

Ile Pro Ser Asp Met Thr Thr Phe Asn Leu Asn Phe Glu His Gln Asn
865 870 875 880

Ala Glu Val Glu Phe Tyr Leu Asp Ala Val Gln Val Thr Leu Ile Ala
885 890 895

Glu Ala Asp Val Thr Pro Val Asp Pro Pro Val Asp Pro Pro Val Glu
900 905 910

Pro Glu Ile Thr Val Val Tyr Ser Met Val Asp Asp Ala Ala Ile Gln
915 920 925

Gly Ile Glu Val Gly Thr Thr Gly Thr Ala Glu Asp Phe Ser Asp Ile
930 935 940

Ser Glu Ala Leu Leu Val Ser Gly Ser Pro Val Val Thr Ala Val Ala
945 950 955 960

His Pro Glu Glu Ala Gly Lys Ile Gly Ile Glu Leu Ser Asn Arg Ala
965 970 975

Glu Asn Trp His Ala Leu Asp Phe Met Phe Pro Ala Ile Gly Val Gln
980 985 990

Arg Gly Gly Ser Tyr Arg Phe Val Ala Ser Gly Arg Met Ala Glu Gly
995 1000 1005

Thr Gly Asn Ser Asn Arg Arg Met Gln Trp Asn Gln Thr Asp Ala Pro
1010 1015 1020

Trp Ser Glu Ile Ser Gly Ser Arg Thr Asn Val Ala Pro Ala Ala Thr
1025 1030 1035 1040

Thr Trp Thr Ile Asp Val Thr Leu Ser Arg Leu Gln Ile Asn Thr Leu
1045 1050 1055

Leu Asn Ala Gly Gln Arg Gly Leu Arg Ile Gln Thr Gly Asn Ala Pro
1060 1065 1070

Thr Val Thr Ile Thr Ile Asp Asp Val Phe Val Tyr Gln Ile Gly Asp
1075 1080 1085

Ile Asp Thr Ala Gly Leu Pro Leu Pro Pro Gln Trp Asn Phe Asp Leu
1090 1095 1100

Pro Arg Leu Ser Glu Leu Phe Glu Pro Tyr Phe Gly Leu Gly Asn Ile
1105 1110 1115 1120

Tyr Ser Thr Glu Thr Leu Met Asn Ala Asn Glu Thr Lys Arg Ala Phe
1125 1130 1135

Leu His His Phe Asn Val Ile Thr Ala Glu Asn Gly His Lys Pro Ser
1140 1145 1150

Ser Ile Ala Gly Pro Glu Asn Ser Phe Thr Val Pro Glu Pro Glu Gln
1155 1160 1165

Phe Asn Phe Thr Asp Ala Asp Arg Ile Val Asn Phe Ala Val Glu Asn
1170 1175 1180

Asp Ile Glu Leu Val Gly His Ala Leu Val Trp His Ser Gln Ser Pro
1185 1190 1195 1200

Asn Trp Leu Phe Arg Ser Ala Ala Asn Thr Pro Leu Thr Arg Ala Glu
1205 1210 1215

Ala Lys Glu Arg Met Ala Tyr Tyr Met Lys Thr Val Ser Glu His Phe
1220 1225 1230

Glu Ala Gln Gly Thr Leu Gly Ala Phe Tyr Gly Trp Asp Val Val Asn
1235 1240 1245

Glu Ala Ile Ala Ser Gly Gly Gly Thr Phe Val Asp Gln Pro Gly His
1250 1255 1260

Trp Arg Thr Gln Met Arg Thr Ser Ser Pro Trp Phe Gln Ala Phe Asn
1265 1270 1275 1280

Asn Gly Leu Asp Val Glu Ala Gly Glu His Ala Ser Asp Tyr Ile Phe
1285 1290 1295

Tyr Ala Tyr Tyr Tyr Ala Arg Lys Tyr Phe Pro Thr Ser Ile Leu Tyr
1300 1305 1310

Tyr Asn Asp Tyr Asn Asp Glu Ile Pro Asn Lys Arg Asp Asn Ile Ala
1315 1320 1325

Gln Met Val Glu Glu Ile Asn Ala Leu Trp Glu Ala His Glu Glu Tyr
1330 1335 1340

Asp Gly Arg Leu Leu Ile Glu Ser Ile Gly Met Gln Ser His Tyr His
1345 1350 1355 1360

Met Glu Gly Trp Thr Thr Ser Val Asp Asn Val Arg Ala Ala Leu Asp
1365 1370 1375

Arg Tyr Ile Ala Thr Gly Ala Arg Val Ser Val Thr Glu Leu Asp Ile
1380 1385 1390

Thr Tyr Gly Gly His Gly Ser Asn Ala Tyr Ala Ser Leu Thr Pro Glu
1395 1400 1405

Gln Leu Ala Ala Gln Ala Glu Arg Tyr Ala Glu Ile Phe Thr Leu Tyr
1410 1415 1420

Leu Glu Arg Ala Asp Gln Leu Ser Arg Val Ser Ile Trp Gly Met Ser
1425 1430 1435 1440

Asp Ala Asn Ser Trp Arg Ser Ser Gly Phe Pro Leu Leu Phe Asp Ser
1445 1450 1455

Ser Leu Asn Ala Lys Pro Ala Phe Asn Ala Ile Val Glu Leu Val Lys
1460 1465 1470

Asn Trp Glu Thr Pro Thr Val Val Ala Pro Val Ile Gln Thr Arg Thr
1475 1480 1485

Leu Ala Pro Leu Glu Ser Gly Glu Arg Val Phe Thr Met Leu Asp Val
1490 1495 1500

Val Arg Gly Ser Asn Ala Pro Val Trp Phe Ser Ile Thr Asp Gly Ala
1505 1510 1515 1520

Leu Pro Glu Gly Ile Ile Leu His Ser Arg Thr Gly Ile Leu Glu Gly
1525 1530 1535

Thr Pro Val Glu Asp Gly His Tyr Ser Phe Thr Val Thr Ala Arg Asn
1540 1545 1550

Tyr Gly Gly Ser Thr Ser Gln Ala Leu Thr Leu Thr Val Gly His Pro
1555 1560 1565

Val Ala Pro Pro Val Thr Pro Pro Val Thr Pro Pro Thr Val Ile Ile
1570 1575 1580

Asp Glu Ser Asp Ile Pro Gln Ala Gly Pro Gly Leu Arg Ala Pro Gln
1585 1590 1595 1600

Ile Val Val Thr Val Gln Glu Gly Ser Glu Val Thr Phe Asp Leu Glu
1605 1610 1615

Lys Leu Glu Glu Val Met Ala Ser Leu Ser Ser Gln Val Pro Leu Val
1620 1625 1630

Leu Asp Val Glu Leu Glu Asp Ser Ile Ile Thr Leu Asp Gln Thr Leu
1635 1640 1645

Leu Lys Arg Leu Thr Asp Lys Ala Ala Gly Ile Glu Ile Gln Ala Asp
1650 1655 1660

Gly Phe Ser Tyr Met Leu Pro Ala Glu Val Leu Glu Ala Ile Leu Trp
1665 1670 1675 1680

<210> 363 
<211> 1317 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample. 
<400> 363 

gtgaccaccc gcgcacaggt tcttgatcag gcgctggcac tgggccaccg gtgtggctgg 
gaaaaactca gcctggacgc catcgccagg gcgctgggcc gttttggcag cctggctctg 
tgtgtcgcgc tgttgagcgc ctgcggcagt agtagtagct ccctggatga tccgggtgct 
ggcagcagtt cttccagctc tgagagcagc caaagctcca gcgccagttc ccaggctgat 
ggcgacggta cccaggacag cctctacgcc caggcggact tccctgtagg ggttgcggtg 
caggtggcca attgggagcc tttcagcctg tttaccgcgc ccgatgccgc tgcgcgtcag 
aacctggttg cccgacactt ctccgaagtg accgcgacca acgtcatgaa aatgtcctat 
atgcgcacca acagtggtgg ttttaccgac gcgccggcgc gtccgctgat tgattttgcc 
cgcgccaatg gcatcaaagt gcacggtcac gcactggtct ggcatgcgga ttatcaggtg 
ccaaatgtgt ttcgtgacta cgaaggggac aattggcagg ggcttttaac cgagcatgtc 
gagggcgtta tggggctgtt tgacgacacc gtggtaagtt gggatgtcgt aaacgaagcg 
gttgataccg gctcacctga cggctggcgc cggtcgattt tctataattt tgcgccgccg 
gaagcagggc aggtgccgga atatattgaa gtggcttacc aggccgctcg agaggccaat 
ccggaagtga ccctctacta caacgatttt gacaacacgg ccaataccgg gcgcctcaac 
aagaccctgg aaattgccga tcgcctgaaa gagctggacg cgatcgacgg tatcgggttc 
cagatgcacg cctatatgaa ctacccgagt attgcgcagt ttcgcaatgc ctttcaggaa 
gtggtcgatc gtgacctgaa agtcaaagtc accgagctgg acattgccat cgtcaaccct 
tacggcagct cgacgcctcc gccgctgccg gagtttgatc aggcgctggc cgacgcccaa 
ggtgtccgtt actgccagat tgccgaggcc tatctggatg tcgttcctgc cgagctgcgg 
ggtggtttca ccgtctgggg cctgaccgat gacgacagct ggctgatggg agcgttcgcg 
tccgcaaccg gcgcccaata cgaccaggtc tatccggtgt tgtttgacga taatctgcaa 
gccaagcccg cgttctttgg cgtcaagcgc gccctccgcg gcgaaccctg cgagtaa 

<210> 364 
<211> 438 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 364 

Met Thr Thr Arg Ala Gln Val Leu Asp Gln Ala Leu Ala Leu Gly His
1 5 10 15

Arg Cys Gly Trp Glu Lys Leu Ser Leu Asp Ala Ile Ala Arg Ala Leu
20 25 30

Gly Arg Phe Gly Ser Leu Ala Leu Cys Val Ala Leu Leu Ser Ala Cys
35 40 45

Gly Ser Ser Ser Ser Ser Leu Asp Asp Pro Gly Ala Gly Ser Ser Ser
50 55 60

Ser Ser Ser Glu Ser Ser Gln Ser Ser Ser Ala Ser Ser Gln Ala Asp
65 70 75 80

Gly Asp Gly Thr Gln Asp Ser Leu Tyr Ala Gln Ala Asp Phe Pro Val
85 90 95

Gly Val Ala Val Gln Val Ala Asn Trp Glu Pro Phe Ser Leu Phe Thr
100 105 110

Ala Pro Asp Ala Ala Ala Arg Gln Asn Leu Val Ala Arg His Phe Ser
115 120 125

Glu Val Thr Ala Thr Asn Val Met Lys Met Ser Tyr Met Arg Thr Asn
130 135 140

Ser Gly Gly Phe Thr Asp Ala Pro Ala Arg Pro Leu Ile Asp Phe Ala
145 150 155 160

Arg Ala Asn Gly Ile Lys Val His Gly His Ala Leu Val Trp His Ala
165 170 175

Asp Tyr Gln Val Pro Asn Val Phe Arg Asp Tyr Glu Gly Asp Asn Trp
180 185 190

Gln Gly Leu Leu Thr Glu His Val Glu Gly Val Met Gly Leu Phe Asp
195 200 205

Asp Thr Val Val Ser Trp Asp Val Val Asn Glu Ala Val Asp Thr Gly
210 215 220

Ser Pro Asp Gly Trp Arg Arg Ser Ile Phe Tyr Asn Phe Ala Pro Pro
225 230 235 240

Glu Ala Gly Gln Val Pro Glu Tyr Ile Glu Val Ala Tyr Gln Ala Ala
245 250 255

Arg Glu Ala Asn Pro Glu Val Thr Leu Tyr Tyr Asn Asp Phe Asp Asn
260 265 270

Thr Ala Asn Thr Gly Arg Leu Asn Lys Thr Leu Glu Ile Ala Asp Arg
275 280 285

Leu Lys Glu Leu Asp Ala Ile Asp Gly Ile Gly Phe Gln Met His Ala
290 295 300

Tyr Met Asn Tyr Pro Ser Ile Ala Gln Phe Arg Asn Ala Phe Gln Glu
305 310 315 320

Val Val Asp Arg Asp Leu Lys Val Lys Val Thr Glu Leu Asp Ile Ala
325 330 335

Ile Val Asn Pro Tyr Gly Ser Ser Thr Pro Pro Pro Leu Pro Glu Phe
340 345 350

Asp Gln Ala Leu Ala Asp Ala Gln Gly Val Arg Tyr Cys Gln Ile Ala
355 360 365

Glu Ala Tyr Leu Asp Val Val Pro Ala Glu Leu Arg Gly Gly Phe Thr
370 375 380

Val Trp Gly Leu Thr Asp Asp Asp Ser Trp Leu Met Gly Ala Phe Ala
385 390 395 400

Ser Ala Thr Gly Ala Gln Tyr Asp Gln Val Tyr Pro Val Leu Phe Asp
405 410 415

Asp Asn Leu Gln Ala Lys Pro Ala Phe Phe Gly Val Lys Arg Ala Leu
420 425 430

Arg Gly Glu Pro Cys Glu
435

<210> 365 
<211> 3246 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 365 

atgaaccact tcgcttcaaa atcgctgcgc atggcgtggc aacccggact gcttgcgaca 60 

accgtgctgc cgttggcggc tgccgcccca ataccagcgc cgaatacgga taccaaagtg 120 

agcaatactt cgtccatcac cactccggct gcggctccac agtcgcagcc acaaccaacg 180 

caagacgcaa acgctcccgc accgcttaaa gcggctttcc gggataagtt tctcatcggc 240 

gcggtgctga gtgacgctgc gctgcgaggc agtgcgcccg acaaggtggc gatagccacc 300 

acgcacttta acgcgctcac cgccgaaaac gccatgaagc cagacgcgat gcaaccgcgc 360 

gaagggcagt tcaacttcgc tgcaggcgat cggctcgtcg aactcgccga aaaaagcggc 420 

ggtgtgccca tcggccacac gctggtgtgg cacgcgcaaa caccgaagtg gttttttgaa 480 

gggccggatg gacagcccgc gacgcgcgaa ctggctttgg agcgcatgcg caaacacatt 540 

tccactgtgg tggggcgcta caaagggcgc atcaaggagt gggatgtggt gaacgaagcc 600 

atcaacgacg gacccggtgt gctgcgtccc tctccctggc tcaaagccat cggcgaagat 660 

tacatcgccg aagccttccg cgccgcgcac gccgccgacc ccgacgcgat tttgatttat 720 

aacgattaca acatcgaact gggctacaaa cggcccaaag cgctgcaact cctaaaatcg 780 

ctcattgacc agaaagtgcc gattcacgcc gtgggcattc agggtcactg gcgcatggac 840 

aacccgaact tcgccgaagt ggaacaggcc atcaaagagt tttcggcgct ggggttgaaa 900 

gtcatgatca ccgaactcga catcggcgtg ctgccgacgc gttatcaggg cgcggatatt 960 

tcagcgaccg aaaccatgac gcccgaacag cgcgccgtga tgaaccccta tacggacgga 1020 

ttgccggacg atgtggcgca aaagcacgcc gagcgctatc gccaggcgtt tgagatgttc 1080 

ctgcggcaca aagacaaaat cagtcgtgtg acattttggg gtgtggacga cggcacttcg 1140 

tggctgaacg gtttcccggt gcgcggccgc accgattatc cgctgctatt tgatcgtcag 1200 

ggcaagccaa aacccgcctt tttcgcggtg caaaacgcgg cgatgggcgc aacagcgcaa 1260 

ccgagcgcca gcgctcccgc aacgcatggc gccgctcctg catccaccaa cattcgcggc 1320 

gccgagtttc ctcgcgtgga aagcgacggg cgggtgacgt ttcgcatcaa agcgcctgac 1380 

gcgcaaaaag tgcaatttga tttaggtaag ccttacgacg ccacccgcga cgccgagggc 1440 

aactggacgg cgaccacaga gccacaagtg cccggcttcc attattattt tttgattgtc 1500

gatggagtgc gcgtggccga cccggcgagc gaaacctttt acggtgcggg ccgccagatg 1560

agcggcatcg aaattcccga tcccgacagc gcgttttatt cgccgcaaaa cgtgccgcat 1620

ggcgaagtgc gcgaacgctg gtatttttcc aacaccacgc aggcgtggcg gcgcatcttc 1680

atttatacgc cgccgggtta cgacaccgat caggccatgc gttttcctgt gctgtatttg 1740

cagcacggcg gtggcgaaga cgaacgcggc tggcccaatc aggggcgcgt gagctttatc 1800

atggacaatc tcatcgcgca gggcaaagcc aaaccgatgc tggtggtgat ggagcaaggc 1860

tatgcgcgca agcccgatga accgcaggtg ccgctgcgcc cgcccggaag caacgccgga 1920

gcgatgccgc ccgactttaa tcgcatgttc gccacgctgg gcgaagtgtt caccaaagac 1980

ctgattccgt ttattgacgc aaattaccgc accaaaaccg agcgcgaaaa ccgcgcgatg 2040

gccggacttt cgatgggtgg aatgcaaagt ttcatcatcg gcctggcgaa caccgatcta 2100

ttcgcgcacc tcggcggttt cagcggcgcg ggtggtggtt ttggcggcgg cgccttcgac 2160

gccaaaaccg cgcacggcgg tgtgatggcc gatgccgatg ccttcaacaa aaaagttcgc 2220

acgatgtttc tcagcatcgg cactgccgag aacgagcgtt ttcagagcag cgtgcgcggt 2280

taccgcgacg cgctgaccaa agcgggcatc aaaaccacgt tctacgaatc gcccggcact 2340

tcgcacgagt ggctgacatg gcgccgcagc ctgcgcgaat tcgcgccgct cttgtttcaa 2400

gaggccaaca cgcagatcga gcgcggcccc aatgcccgcc cgattgcgcc gcagccgatt 2460

gttcttggcc cgggcgacaa gcccgccttc cctccggcgc cctccggttt cgatgcgcgg 2520

cgcgatggca ttccgcacgg cgaaattaaa cttgtggaat acccttctgc cacggtcggc 2580

accacgcgca agatgcaggt ctatacgccg ccgggctaca acccgcaaga agaatatccc 2640

gtgctctatt tgctgcacgg catcggcggc gacgagtggg aatggaaaaa tggcggcacg 2700

cccgaagtga ttctcgacaa cctctacgct gagaagaaac tccagccgat gatcgtggtg 2760

atgcccaatg ggcgcgcgca aaaagacgac cgtcctatcg gcaacgtgtt cgcttccgct 2820

ccggcgtttg cgacgtttga gaaagatttg ctgaacgaca ttatcccctt tgttgagaag 2880

aattacccga ccaaaaccgg cccgcaaaat cgcgctttgg ccggtctttc gatgggcggc 2940

gggcaatctc tcaactttgg cctcggcaac ctcgacacct tcgcgtgggt tggcggcttt 3000

tcgtccgcgc ccaacacgcg cagcggcgca agtctactgg ccaatcccga cgacgccaaa 3060

aagaagctga agctgctgtg ggtttcgtgc ggcgataaag acaatttgat gtttatcagc 3120

cagcgcacgc accgttatct tgccgagaat aacgtgccgc acatctggca tgtacagccc 3180

ggcggacacg acttcaaggt gtggaagcaa gacctgtata acttcgccca actgctattc 3240

cgttaa 3246

<210> 366
<211> 1081
<212> PRT
<213> Unknown

<220>

<223> Obtained from an environmental sample.

<221> SIGNAL
<222> (1)...(65)

<400> 366

Met Asn His Phe Ala Ser Lys Ser Leu Arg Met Ala Trp Gln Pro Gly
1 5 10 15

Leu Leu Ala Thr Thr Val Leu Pro Leu Ala Ala Ala Ala Pro Ile Pro
20 25 30

Ala Pro Asn Thr Asp Thr Lys Val Ser Asn Thr Ser Ser Ile Thr Thr
35 40 45

Pro Ala Ala Ala Pro Gln Ser Gln Pro Gln Pro Thr Gln Asp Ala Asn
50 55 60

Ala Pro Ala Pro Leu Lys Ala Ala Phe Arg Asp Lys Phe Leu Ile Gly
65 70 75 80

Ala Val Leu Ser Asp Ala Ala Leu Arg Gly Ser Ala Pro Asp Lys Val
85 90 95

Ala Ile Ala Thr Thr His Phe Asn Ala Leu Thr Ala Glu Asn Ala Met
100 105 110

Lys Pro Asp Ala Met Gln Pro Arg Glu Gly Gln Phe Asn Phe Ala Ala
115 120 125

Gly Asp Arg Leu Val Glu Leu Ala Glu Lys Ser Gly Gly Val Pro Ile
130 135 140

Gly His Thr Leu Val Trp His Ala Gln Thr Pro Lys Trp Phe Phe Glu
145 150 155 160

Gly Pro Asp Gly Gln Pro Ala Thr Arg Glu Leu Ala Leu Glu Arg Met
165 170 175

Arg Lys His Ile Ser Thr Val Val Gly Arg Tyr Lys Gly Arg Ile Lys
180 185 190

Glu Trp Asp Val Val Asn Glu Ala Ile Asn Asp Gly Pro Gly Val Leu
195 200 205

Arg Pro Ser Pro Trp Leu Lys Ala Ile Gly Glu Asp Tyr Ile Ala Glu
210 215 220

Ala Phe Arg Ala Ala His Ala Ala Asp Pro Asp Ala Ile Leu Ile Tyr
225 230 235 240

Asn Asp Tyr Asn Ile Glu Leu Gly Tyr Lys Arg Pro Lys Ala Leu Gln
245 250 255

Leu Leu Lys Ser Leu Ile Asp Gln Lys Val Pro Ile His Ala Val Gly
260 265 270

Ile Gln Gly His Trp Arg Met Asp Asn Pro Asn Phe Ala Glu Val Glu
275 280 285

Gln Ala Ile Lys Glu Phe Ser Ala Leu Gly Leu Lys Val Met Ile Thr
290 295 300

Glu Leu Asp Ile Gly Val Leu Pro Thr Arg Tyr Gln Gly Ala Asp Ile
305 310 315 320

Ser Ala Thr Glu Thr Met Thr Pro Glu Gln Arg Ala Val Met Asn Pro
325 330 335

Tyr Thr Asp Gly Leu Pro Asp Asp Val Ala Gln Lys His Ala Glu Arg
340 345 350

Tyr Arg Gln Ala Phe Glu Met Phe Leu Arg His Lys Asp Lys Ile Ser
355 360 365

Arg Val Thr Phe Trp Gly Val Asp Asp Gly Thr Ser Trp Leu Asn Gly
370 375 380

Phe Pro Val Arg Gly Arg Thr Asp Tyr Pro Leu Leu Phe Asp Arg Gln
385 390 395 400

Gly Lys Pro Lys Pro Ala Phe Phe Ala Val Gln Asn Ala Ala Met Gly
405 410 415

Ala Thr Ala Gln Pro Ser Ala Ser Ala Pro Ala Thr His Gly Ala Ala
420 425 430

Pro Ala Ser Thr Asn Ile Arg Gly Ala Glu Phe Pro Arg Val Glu Ser
435 440 445

Asp Gly Arg Val Thr Phe Arg Ile Lys Ala Pro Asp Ala Gln Lys Val
450 455 460

Gln Phe Asp Leu Gly Lys Pro Tyr Asp Ala Thr Arg Asp Ala Glu Gly
465 470 475 480

Asn Trp Thr Ala Thr Thr Glu Pro Gln Val Pro Gly Phe His Tyr Tyr
485 490 495

Phe Leu Ile Val Asp Gly Val Arg Val Ala Asp Pro Ala Ser Glu Thr
500 505 510

Phe Tyr Gly Ala Gly Arg Gln Met Ser Gly Ile Glu Ile Pro Asp Pro
515 520 525

Asp Ser Ala Phe Tyr Ser Pro Gln Asn Val Pro His Gly Glu Val Arg
530 535 540

Glu Arg Trp Tyr Phe Ser Asn Thr Thr Gln Ala Trp Arg Arg Ile Phe
545 550 555 560

Ile Tyr Thr Pro Pro Gly Tyr Asp Thr Asp Gln Ala Met Arg Phe Pro
565 570 575

Val Leu Tyr Leu Gln His Gly Gly Gly Glu Asp Glu Arg Gly Trp Pro
580 585 590

Asn Gln Gly Arg Val Ser Phe Ile Met Asp Asn Leu Ile Ala Gln Gly
595 600 605

Lys Ala Lys Pro Met Leu Val Val Met Glu Gln Gly Tyr Ala Arg Lys
610 615 620

Pro Asp Glu Pro Gln Val Pro Leu Arg Pro Pro Gly Ser Asn Ala Gly
625 630 635 640

Ala Met Pro Pro Asp Phe Asn Arg Met Phe Ala Thr Leu Gly Glu Val
645 650 655

Phe Thr Lys Asp Leu Ile Pro Phe Ile Asp Ala Asn Tyr Arg Thr Lys
660 665 670

Thr Glu Arg Glu Asn Arg Ala Met Ala Gly Leu Ser Met Gly Gly Met
675 680 685

Gln Ser Phe Ile Ile Gly Leu Ala Asn Thr Asp Leu Phe Ala His Leu
690 695 700

Gly Gly Phe Ser Gly Ala Gly Gly Gly Phe Gly Gly Gly Ala Phe Asp
705 710 715 720

Ala Lys Thr Ala His Gly Gly Val Met Ala Asp Ala Asp Ala Phe Asn
725 730 735

Lys Lys Val Arg Thr Met Phe Leu Ser Ile Gly Thr Ala Glu Asn Glu
740 745 750

Arg Phe Gln Ser Ser Val Arg Gly Tyr Arg Asp Ala Leu Thr Lys Ala
755 760 765

Gly Ile Lys Thr Thr Phe Tyr Glu Ser Pro Gly Thr Ser His Glu Trp
770 775 780

Leu Thr Trp Arg Arg Ser Leu Arg Glu Phe Ala Pro Leu Leu Phe Gln
785 790 795 800

Glu Ala Asn Thr Gln Ile Glu Arg Gly Pro Asn Ala Arg Pro Ile Ala
805 810 815

Pro Gln Pro Ile Val Leu Gly Pro Gly Asp Lys Pro Ala Phe Pro Pro
820 825 830

Ala Pro Ser Gly Phe Asp Ala Arg Arg Asp Gly Ile Pro His Gly Glu
835 840 845

Ile Lys Leu Val Glu Tyr Pro Ser Ala Thr Val Gly Thr Thr Arg Lys
850 855 860

Met Gln Val Tyr Thr Pro Pro Gly Tyr Asn Pro Gln Glu Glu Tyr Pro
865 870 875 880

Val Leu Tyr Leu Leu His Gly Ile Gly Gly Asp Glu Trp Glu Trp Lys
885 890 895

Asn Gly Gly Thr Pro Glu Val Ile Leu Asp Asn Leu Tyr Ala Glu Lys
900 905 910

Lys Leu Gln Pro Met Ile Val Val Met Pro Asn Gly Arg Ala Gln Lys
915 920 925

Asp Asp Arg Pro Ile Gly Asn Val Phe Ala Ser Ala Pro Ala Phe Ala
930 935 940

Thr Phe Glu Lys Asp Leu Leu Asn Asp Ile Ile Pro Phe Val Glu Lys
945 950 955 960

Asn Tyr Pro Thr Lys Thr Gly Pro Gln Asn Arg Ala Leu Ala Gly Leu
965 970 975

Ser Met Gly Gly Gly Gln Ser Leu Asn Phe Gly Leu Gly Asn Leu Asp
980 985 990

Thr Phe Ala Trp Val Gly Gly Phe Ser Ser Ala Pro Asn Thr Arg Ser
995 1000 1005

Gly Ala Ser Leu Leu Ala Asn Pro Asp Asp Ala Lys Lys Lys Leu Lys
1010 1015 1020

Leu Leu Trp Val Ser Cys Gly Asp Lys Asp Asn Leu Met Phe Ile Ser
1025 1030 1035 1040

Gln Arg Thr His Arg Tyr Leu Ala Glu Asn Asn Val Pro His Ile Trp
1045 1050 1055

His Val Gln Pro Gly Gly His Asp Phe Lys Val Trp Lys Gln Asp Leu
1060 1065 1070

Tyr Asn Phe Ala Gln Leu Leu Phe Arg
1075 1080

<210> 367 
<211> 1338 
<212> DNA 
<213> unknown 

<220> 

<223> obtained from an environmental sample. 
<400> 367 

atgaaaagaa ttggattact atttatggcg ttggcgctaa ccgcatttat ggcgcagcat 60 

tcgtccgctc aaaggatttg caataaccaa acagggaccc atggtggatt ctactacaca 120 

tggtggagtg atgggggtgg atctgcatgt ataacaatgg gcgatggcgg taactacagc 180 

acccaatgga gcaataccgg taactttgta ggcggtaagg gttggagcac aggaagatcc 240 

aaccgcgtaa ttagttacaa tgctggtaac tggtcgccat cgggtaatgc ttacctatgt 300 

ttatatggct ggactaccaa cccgcttgtt gagtactacg tagttgatag ctggggttct 360 

tggagacctc ccggagcaac atcgcaggga acagtaaata ctgatggtgg cacctatgag 420 

atatacagaa ctcagcgtgt aaaccagcca tctattcagg ggaatactac tttctatcag 480 

tattggagcg ttagaacctc taaaagggcc actggaagca atgctaccat caccttccag 540 

aaccacgtaa atgcttgggc aagtaggggt tggaacttgg gagctcatag ctatcaggta 600 

ctggctaccg agggttatca gagcagcgga agttcaaata ttactgtttg ggaaggtggt 660 

tcaagtggag gttcttcagg tggaagcacc ggaggcagca ctggaggtgg atcacacgag 720 

atcattgtaa gagcccgtgg tgtagtaggt tcagagcaaa ttaggcttag ggttggcaat 780 

acaaccgttg caacttggac ccttactacc ggttataggg actatagggc tactacctca 840 

gctactggtg gtattctggt agagtacttc aatgatagcg gcaaccgtga tgttcagatt 900 

gattacatta gggtaaacgg ctcaactcgt caatctgaga acatgtcgta caatacaggg 960 

gtatggcaga atggctcatg cggcggctcc aatagcgagt ggctacactg caacggagct 1020 

attggctacg gcgatgtggt tactggcaga tcaaccgctg ttgaggaagc atttactgct 1080

gccgaggatt gtggctgtga acctaaggca accctattcc ccaaccctgc tggcagtacc 1140

ctcagtatta tgctagacag gcaaccctat ggcgatgtaa gtattagaat atataatacg 1200 

gtaggtgcag ttgttcgcac catcaacaat ccagacctac tcactgaggt tgatgtcagt 1260 

gcattaaatt ctggaatcta ctttgtagag cttaggtccg aaggacatgt aagcaactac 1320 

aaatttatta aaaagtag 1338

<210> 368 
<211> 445 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)...(23)

<400> 368 

Met Lys Arg Ile Gly Leu Leu Phe Met Ala Leu Ala Leu Thr Ala Phe
1 5 10 15

Met Ala Gln His Ser Ser Ala Gln Arg Ile Cys Asn Asn Gln Thr Gly
20 25 30

Thr His Gly Gly Phe Tyr Tyr Thr Trp Trp Ser Asp Gly Gly Gly Ser
35 40 45

Ala Cys Ile Thr Met Gly Asp Gly Gly Asn Tyr Ser Thr Gln Trp Ser
50 55 60

Asn Thr Gly Asn Phe Val Gly Gly Lys Gly Trp Ser Thr Gly Arg Ser
65 70 75 80

Asn Arg Val Ile Ser Tyr Asn Ala Gly Asn Trp Ser Pro Ser Gly Asn
85 90 95

Ala Tyr Leu Cys Leu Tyr Gly Trp Thr Thr Asn Pro Leu Val Glu Tyr
100 105 110

Tyr Val Val Asp Ser Trp Gly Ser Trp Arg Pro Pro Gly Ala Thr Ser
115 120 125

Gln Gly Thr Val Asn Thr Asp Gly Gly Thr Tyr Glu Ile Tyr Arg Thr
130 135 140

Gln Arg Val Asn Gln Pro Ser Ile Gln Gly Asn Thr Thr Phe Tyr Gln
145 150 155 160

Tyr Trp Ser Val Arg Thr Ser Lys Arg Ala Thr Gly Ser Asn Ala Thr
165 170 175

Ile Thr Phe Gln Asn His Val Asn Ala Trp Ala Ser Arg Gly Trp Asn
180 185 190

Leu Gly Ala His Ser Tyr Gln Val Leu Ala Thr Glu Gly Tyr Gln Ser
195 200 205

Ser Gly Ser Ser Asn Ile Thr Val Trp Glu Gly Gly Ser Ser Gly Gly
210 215 220

Ser Ser Gly Gly Ser Thr Gly Gly Ser Thr Gly Gly Gly Ser His Glu
225 230 235 240

Ile Ile Val Arg Ala Arg Gly Val Val Gly Ser Glu Gln Ile Arg Leu
245 250 255

Arg Val Gly Asn Thr Thr Val Ala Thr Trp Thr Leu Thr Thr Gly Tyr
260 265 270

Arg Asp Tyr Arg Ala Thr Thr Ser Ala Thr Gly Gly Ile Leu Val Glu
275 280 285

Tyr Phe Asn Asp Ser Gly Asn Arg Asp Val Gln Ile Asp Tyr Ile Arg
290 295 300

Val Asn Gly Ser Thr Arg Gln Ser Glu Asn Met Ser Tyr Asn Thr Gly
305 310 315 320

Val Trp Gln Asn Gly Ser Cys Gly Gly Ser Asn Ser Glu Trp Leu His
325 330 335

Cys Asn Gly Ala Ile Gly Tyr Gly Asp Val Val Thr Gly Arg Ser Thr
340 345 350

Ala Val Glu Glu Ala Phe Thr Ala Ala Glu Asp Cys Gly Cys Glu Pro
355 360 365

Lys Ala Thr Leu Phe Pro Asn Pro Ala Gly Ser Thr Leu Ser Ile Met
370 375 380

Leu Asp Arg Gln Pro Tyr Gly Asp Val Ser Ile Arg Ile Tyr Asn Thr
385 390 395 400

Val Gly Ala Val Val Arg Thr Ile Asn Asn Pro Asp Leu Leu Thr Glu
405 410 415

Val Asp Val Ser Ala Leu Asn Ser Gly Ile Tyr Phe Val Glu Leu Arg
420 425 430

Ser Glu Gly His Val Ser Asn Tyr Lys Phe Ile Lys Lys
435 440 445

<210> 369 
<211> 1077 
<212> DNA 
<213> Unknown 

<220> 

<223> obtained from an environmental sample. 
<400> 369 

atgaaatcat ttatcactgg caaaaaaatt gctgctggac taattactgc agctgctttg 60 

agcgcatcta tggtgagcgc gcaaaccctg acttcaaatt ctcaaggcac ccacgacgga 120 

tttttctact ctttctggaa ggactcaggc aacgcctcaa tgaacttatt ggcgggcggc 180 

cgttatcagt ctagctggaa caccggcacc aacaactggg taggcggtaa aggctggaac 240 

ccaggcacta acaaccgtgt aattaactac tctggttact acggtgtgga caactcccaa 300 

aactcttacg tcgcgcttta cggctggacc agaaacccat tggttgagta ctacgtgatt 360 

gagagctacg gctcatacaa ccctgctagc tgctctggcg gcaccgattt cggtagcttc 420 

caaagtgacg gcgccaccta caacgtgcgt cgttgccagc gcgtgcaaca gccttcgatc 480 

gatggcaccc agactttcta ccaatacttc agcgtgagaa atccgaaaaa agggtttggg 540 

aacatttctg gcaccatcac ctttgctaac cacgtaaact actggagaag cagagggatg 600 

aatcttggta accacgatta ccaagttctc gctactgaag gctacagaag cacgggttct 660 

tctgacctca ccatcagcca aggcgcaagc aacaacggcg gtggcggcag tagctcaagt 720 

gctccatctg ctgggggcgg tagcaagaca atcgtcgtgc gggcacgcgg gactaccgga 780 

caagagcaaa tccgtttgcg ggtgaacaac actattgttc agacctggac cttgtccacc 840 

accatgcgcg actacaccgt caacactaac ttggcaggcg ggtcattggt tgaatacttc 900 

aatgacagcg gcaaccgcga cgtccaagtt gattacatca gcgtaaatgg caatgttcgc 960 

caatccgaaa accaaaccta caacaccggt gtctaccaga acggtgcgtg tggcggcggt 1020 

aacggccgga gcgagtggct ccattgcaac ggtgcaatcg ggtacggcga tatctaa 1077 

<210> 370 

<211> 358 

<212> PRT 

<213> Unknown 

<220> 

<223> obtained from an environmental sample. 

<221> SIGNAL 
<222> (1)... (27) 

<400> 370 

Met Lys Ser Phe Ile Thr Gly Lys Lys Ile Ala Ala Gly Leu Ile Thr

1 5 10 15

Ala Ala Ala Leu Ser Ala Ser Met Val Ser Ala Gln Thr Leu Thr Ser

20 25 30

Asn Ser Gln Gly Thr His Asp Gly Phe Phe Tyr Ser Phe Trp Lys Asp

35 40 45

Ser Gly Asn Ala Ser Met Asn Leu Leu Ala Gly Gly Arg Tyr Gln Ser

50 55 60

Ser Trp Asn Thr Gly Thr Asn Asn Trp Val Gly Gly Lys Gly Trp Asn
65 70 75 80

Pro Gly Thr Asn Asn Arg Val Ile Asn Tyr Ser Gly Tyr Tyr Gly Val

85 90 95

Asp Asn Ser Gln Asn Ser Tyr Val Ala Leu Tyr Gly Trp Thr Arg Asn

100 105 110

Pro Leu Val Glu Tyr Tyr Val Ile Glu Ser Tyr Gly Ser Tyr Asn Pro

115 120 125

Ala Ser Cys Ser Gly Gly Thr Asp Phe Gly Ser Phe Gln Ser Asp Gly

130 135 140

Ala Thr Tyr Asn Val Arg Arg Cys Gln Arg Val Gln Gln Pro Ser Ile
145 150 155 160

Asp Gly Thr Gln Thr Phe Tyr Gln Tyr Phe Ser Val Arg Asn Pro Lys

165 170 175

Lys Gly Phe Gly Asn Ile Ser Gly Thr Ile Thr Phe Ala Asn His Val
180 185 190

Asn Tyr Trp Arg Ser Arg Gly Met Asn Leu Gly Asn His Asp Tyr Gln

195 200 205

Val Leu Ala Thr Glu Gly Tyr Arg Ser Thr Gly Ser Ser Asp Leu Thr

210 215 220

Ile Ser Gln Gly Ala Ser Asn Asn Gly Gly Gly Gly Ser Ser Ser Ser
225 230 235 240

Ala Pro Ser Ala Gly Gly Gly Ser Lys Thr Ile Val Val Arg Ala Arg

245 250 255

Gly Thr Thr Gly Gln Glu Gln Ile Arg Leu Arg Val Asn Asn Thr Ile

260 265 270

Val Gln Thr Trp Thr Leu Ser Thr Thr Met Arg Asp Tyr Thr Val Asn

275 280 285

Thr Asn Leu Ala Gly Gly Ser Leu Val Glu Tyr Phe Asn Asp Ser Gly

290 295 300

Asn Arg Asp Val Gln Val Asp Tyr Ile Ser Val Asn Gly Asn Val Arg
305 310 315 320

Gln Ser Glu Asn Gln Thr Tyr Asn Thr Gly Val Tyr Gln Asn Gly Ala

325 330 335

Cys Gly Gly Gly Asn Gly Arg Ser Glu Trp Leu His Cys Asn Gly Ala

340 345 350

Ile Gly Tyr Gly Asp Ile
355 

<210> 371 

<211> 1245 

<212> DNA 

<213> Unknown 

<220> 

<223> obtained from an environmental sample. 
<400> 371 

gtgaccggga tcgcgagaaa aggcgtatgg tccgtgattt ccggaacttt cactgccggg 60 

gattacgatt cctacctgct gtatgtcgaa acacaggacc agggcggcgg acacccgacg 120 

ctgagctttg aaatccggaa cttcagactg acggcaccgg aaggcatcgc tccgccgaag 180 

gcgacagaag aaccggctga cgcggcagag gcgacgcctg ttccggcact gagcgagatt 240 

ccgggcctga aggacgtcta cgcggactac tttgacttcg gcgctgcggc gccgcagtat 300 

gcattcggcc tcggccagac ccagctgcag gacctgatga tcagccagtt cagcatcctg 360 

acccctgaaa acgaactgaa accggacagc gtgcttgatg tccagacgag taaaaaactg 420 

gcggcagaag acgaaaccgc ggtggcgatc aggctgaacg ccgcaacgcc gctgctgaag 480 

ttcgcgcaga agaacggcat caaagtgcac ggccatgtgc tggtatggca cagccagacg 540 

ccggaagctt tcttccatga aggatacgat accaagaaac cctatgtgac gagagaggtt 600 

atgctcggcc gcctggaaaa ctatatccgt gaagtgctga cgcagacaga ggaacagttc 660 

ccgggcgtga tcgtcagctg ggacgtcgtg aacgaggcga tcgacgacgg tactcactgg 720 

ctgcggaaga cttccagctg gtacaaagtc gtcggcgagg atttcctgaa cagggctttt 780 

gaatacgcca ggaaatacgc cgcggagggc gtgctgctgt actacaacga ttacagcacg 840 

gcaaattcgg ctaaactgat gggcatcacg aagctgctga agcagctgat tccagacggg 900 

aatatcgacg gctacggatt ccagatgcac catgacctcg gctggccgag catcgacctt 960 

atggcggcag ctgtgaagca gattgccggc ctggggctga aactgcgcgt cagcgaactg 1020 

gatatcggcg tatccaagaa caatcaggaa aactatgaca aacaggccaa acgctacaag 1080 

gaaatgctga acctgatgct gcagtacgcg gaccagacgg aagccgtgca ggtctggggc 1140 

ctgacggaca acatgagctg gagaaccggc aaatacccgc tgctgttcga cagcgcggca 1200 

aaaccgaaaa aggcgttctt cgcggtgatt gaagccgcag aggaa 1245 

<210> 372 
<211> 415 
<212> PRT
<213> Unknown

<220> 

<223> obtained from an environmental sample. 
<400> 372 

Met Thr Gly Ile Ala Arg Lys Gly Val Trp Ser Val Ile Ser Gly Thr

1 5 10 15

Phe Thr Ala Gly Asp Tyr Asp Ser Tyr Leu Leu Tyr Val Glu Thr Gln

20 25 30

Asp Gln Gly Gly Gly His Pro Thr Leu Ser Phe Glu Ile Arg Asn Phe
35 40 45

Arg Leu Thr Ala Pro Glu Gly Ile Ala Pro Pro Lys Ala Thr Glu Glu

50 55 60

Pro Ala Asp Ala Ala Glu Ala Thr Pro Val Pro Ala Leu Ser Glu Ile
65 70 75 80

Pro Gly Leu Lys Asp Val Tyr Ala Asp Tyr Phe Asp Phe Gly Ala Ala

85 90 95

Ala Pro Gln Tyr Ala Phe Gly Leu Gly Gln Thr Gln Leu Gln Asp Leu

100 105 110

Met Ile Ser Gln Phe Ser Ile Leu Thr Pro Glu Asn Glu Leu Lys Pro

115 120 125

Asp Ser Val Leu Asp Val Gln Thr Ser Lys Lys Leu Ala Ala Glu Asp

130 135 140

Glu Thr Ala Val Ala Ile Arg Leu Asn Ala Ala Thr Pro Leu Leu Lys
145 150 155 160

Phe Ala Gln Lys Asn Gly Ile Lys Val His Gly His Val Leu Val Trp

165 170 175

His Ser Gln Thr Pro Glu Ala Phe Phe His Glu Gly Tyr Asp Thr Lys

180 185 190

Lys Pro Tyr Val Thr Arg Glu Val Met Leu Gly Arg Leu Glu Asn Tyr

195 200 205

Ile Arg Glu Val Leu Thr Gln Thr Glu Glu Gln Phe Pro Gly Val Ile

210 215 220

Val Ser Trp Asp Val Val Asn Glu Ala Ile Asp Asp Gly Thr His Trp
225 230 235 240

Leu Arg Lys Thr Ser Ser Trp Tyr Lys Val Val Gly Glu Asp Phe Leu

245 250 255

Asn Arg Ala Phe Glu Tyr Ala Arg Lys Tyr Ala Ala Glu Gly Val Leu

260 265 270

Leu Tyr Tyr Asn Asp Tyr Ser Thr Ala Asn Ser Ala Lys Leu Met Gly

275 280 285

Ile Thr Lys Leu Leu Lys Gln Leu Ile Pro Asp Gly Asn Ile Asp Gly

290 295 300

Tyr Gly Phe Gln Met His His Asp Leu Gly Trp Pro Ser Ile Asp Leu
305 310 315 320

Met Ala Ala Ala Val Lys Gln Ile Ala Gly Leu Gly Leu Lys Leu Arg

325 330 335

Val Ser Glu Leu Asp Ile Gly Val Ser Lys Asn Asn Gln Glu Asn Tyr

340 345 350

Asp Lys Gln Ala Lys Arg Tyr Lys Glu Met Leu Asn Leu Met Leu Gln

355 360 365

Tyr Ala Asp Gln Thr Glu Ala Val Gln Val Trp Gly Leu Thr Asp Asn

370 375 380

Met Ser Trp Arg Thr Gly Lys Tyr Pro Leu Leu Phe Asp Ser Ala Ala
385 390 395 400

Lys Pro Lys Lys Ala Phe Phe Ala Val Ile Glu Ala Ala Glu Glu
405 410 415 

<210> 373 
<211> 1539 
<212> DNA 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 373 

ttgattggct gcgtcatgtc gccgccggaa gcgggaagtc cccgttttga tcttttaacc 60 

cggcacttta atgtcatcac cgcggaaaac gccatgaagc ccgcgtcgtt gcagcgcgaa 120 

aagggggtgt ttacttttga acaggcggac atgatggtgg acgcggtatt ggagcgggga 180 

ctgaagatcc acggacatac tctggcctgg caccagcagt ctccggagtg gatgaatcat 240 

gaggggattt cccgggacga agccgtggaa aatctcaccg tccacgccaa aaccgcggcc 300 

gctcatttta gggggcgggt catatcctgg gatgtactca acgaggcgat cattgacaat 360 

ccccccaacc ccggggattg gcgggcatcc ctcaggcaaa gcccctggta caaagccata 420 

ggcccggatt acgtggagct tgtgttcaag gcggccaggg aggcggaccc ggaggcaaaa 480 

ctttattata acgattacaa ccttgataac cggaacaagg ccctggcggt ttacaacatg 540 

gtcagggaac tgaacgaaaa gaatccgaat ccgggcggca ggcccctcat cgacggcgtg 600 

ggcatgcagg gccattaccg cctgaatacc aataccgata acgtgaggct gtcgctggaa 660 

cggtttattt ccctgggggt cgaggtcagc atcacggagc tcgatataca ggccggttcg 720 
gattcaaacc agacagagcg gcagcgggtg gaacagggcc tggtctatgc cgctttgttt 780

accattttcc gggaacacgc ggcaaacata ggccgggtaa ctttttgggg acttgacgac 840 

ggggcaagct ggcgttccgc ggcgagtccc tgcctctttg ataaaaacct caacgcaaaa 900 

cctgcctttt acgcggtcct ggacccggat tcctttattg cggaaaacag cgccctgctg 960 

atcagggaag cgaaagaggg agaggcttat tatggtacgc ctgctttagg cgccgtccct 1020 

gatcccctct gggacagggc gccttccctc ccggtggatc agtacctcat ggcctggcag 1080 

ggcgcttcgg gaagggcaaa agtcctctgg gacgaaaaaa atctctatgt gctggtccgg 1140 

gttgaaaacg cggaaataaa caaggacagt tccaacagct acgaacagga ttcggtcgaa 1200 

atttttattg atgaggataa ccggaaaagt tcctttttca gggaggatga cgggcagtac 1260 

cgggtcaatt ttgccaacga ggcgggcttt aacccctcgt ccgccggggc ggggtttgtt 1320 

tcggccgccg cggtggatgg aaaatcctat accgttacca tgaagattcc ctttaaaaca 1380 

atagtccccg gagcggggac gcgtatcggg tttgatgtcc agatcaacgg cgcgtcggcc 1440 

agggggatac gggagagcgt ggcggtatgg aatgatacca cgggcaattc atttcaggat 1500 

acctcaggtt acggggtact gcggttagta aaaaagtaa 1539 

<210> 374 
<211> 512 
<212> PRT 
<213> Unknown 

<220> 

<223> Obtained from an environmental sample. 
<400> 374 

Met Ile Gly Cys Val Met Ser Pro Pro Glu Ala Gly Ser Pro Arg Phe

1 5 10 15

Asp Leu Leu Thr Arg His Phe Asn Val Ile Thr Ala Glu Asn Ala Met

20 25 30

Lys Pro Ala Ser Leu Gln Arg Glu Lys Gly Val Phe Thr Phe Glu Gln

35 40 45

Ala Asp Met Met Val Asp Ala Val Leu Glu Arg Gly Leu Lys Ile His

50 55 60

Gly His Thr Leu Ala Trp His Gln Gln Ser Pro Glu Trp Met Asn His
65 70 75 80

Glu Gly Ile Ser Arg Asp Glu Ala Val Glu Asn Leu Thr Val His Ala

85 90 95

Lys Thr Ala Ala Ala His Phe Arg Gly Arg Val Ile Ser Trp Asp Val

100 105 110

Leu Asn Glu Ala Ile Ile Asp Asn Pro Pro Asn Pro Gly Asp Trp Arg

115 120 125

Ala Ser Leu Arg Gln Ser Pro Trp Tyr Lys Ala Ile Gly Pro Asp Tyr

130 135 140

Val Glu Leu Val Phe Lys Ala Ala Arg Glu Ala Asp Pro Glu Ala Lys
145 150 155 160

Leu Tyr Tyr Asn Asp Tyr Asn Leu Asp Asn Arg Asn Lys Ala Leu Ala

165 170 175

Val Tyr Asn Met Val Arg Glu Leu Asn Glu Lys Asn Pro Asn Pro Gly

180 185 190

Gly Arg Pro Leu Ile Asp Gly Val Gly Met Gln Gly His Tyr Arg Leu

195 200 205

Asn Thr Asn Thr Asp Asn Val Arg Leu Ser Leu Glu Arg Phe Ile Ser

210 215 220

Leu Gly Val Glu Val Ser Ile Thr Glu Leu Asp Ile Gln Ala Gly Ser
225 230 235 240

Asp Ser Asn Gln Thr Glu Arg Gln Arg Val Glu Gln Gly Leu Val Tyr

245 250 255

Ala Ala Leu Phe Thr Ile Phe Arg Glu His Ala Ala Asn Ile Gly Arg

260 265 270

Val Thr Phe Trp Gly Leu Asp Asp Gly Ala Ser Trp Arg Ser Ala Ala

275 280 285

Ser Pro Cys Leu Phe Asp Lys Asn Leu Asn Ala Lys Pro Ala Phe Tyr

290 295 300

Ala Val Leu Asp Pro Asp Ser Phe Ile Ala Glu Asn Ser Ala Leu Leu
305 310 315 320

Ile Arg Glu Ala Lys Glu Gly Glu Ala Tyr Tyr Gly Thr Pro Ala Leu

325 330 335

Gly Ala Val Pro Asp Pro Leu Trp Asp Arg Ala Pro Ser Leu Pro Val

340 345 350

Asp Gln Tyr Leu Met Ala Trp Gln Gly Ala Ser Gly Arg Ala Lys Val

355 360 365 

Leu Trp Asp Glu Lys Asn Leu Tyr Val Leu Val Arg Val Glu Asn Ala
370 375 380

Glu Ile Asn Lys Asp Ser Ser Asn Ser Tyr Glu Gln Asp Ser Val Glu
385 390 395 400

Ile Phe Ile Asp Glu Asp Asn Arg Lys Ser Ser Phe Phe Arg Glu Asp

405 410 415

Asp Gly Gln Tyr Arg Val Asn Phe Ala Asn Glu Ala Gly Phe Asn Pro

420 425 430

Ser Ser Ala Gly Ala Gly Phe Val Ser Ala Ala Ala Val Asp Gly Lys

435 440 445

Ser Tyr Thr Val Thr Met Lys Ile Pro Phe Lys Thr Ile Val Pro Gly

450 455 460

Ala Gly Thr Arg Ile Gly Phe Asp Val Gln Ile Asn Gly Ala Ser Ala
465 470 475 480

Arg Gly Ile Arg Glu Ser Val Ala Val Trp Asn Asp Thr Thr Gly Asn

485 490 495

Ser Phe Gln Asp Thr Ser Gly Tyr Gly Val Leu Arg Leu Val Lys Lys
500 505 510 

<210> 375 
<211> 570 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> synthetically generated polynucleotide 
<400> 375 

atggccctta tggcttcgac attctactgg cacttgtgga ctgatggtat agggacagta 60 
aatgctacca atggatctga tggcaattac agcgtttcat ggtcaaattg cgggaatttt 120 
gttgttggta aaggctggac taccggatca gcaactaggg taataaacta taatgcccac 180 
gccttttcgg tagtgggtaa tgcttatttg gctctttatg ggtggacgag aaattcactc 240 
atagaatatt acgtcgttga tagctggggg acttatagac ctactggaac ttataaaggc 300 
actgtgacta gtgatggagg gacttatgac atatacacga ctacacgaac caacgcacct 360 
tccattgacg gcaataatac aactttcacc cagttctgga gtgttaggca gtcgaagaga 420 
ccgattggta ccaacaatac catcaccttt agcaaccatg ttaacgcctg gaagagtaaa 480 
ggaatgaatt tggggagtag ttggtcttat caggtattag caacagaggg ctatcaaagt 540 
agtgggtact ctaacgtaac ggtctggtaa 570

<210> 376 
<211> 189 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Synthetically generated polypeptide 
<400> 376 

Met Ala Leu Met Ala Ser Thr Phe Tyr Trp His Leu Trp Thr Asp Gly 

1 5 10 15

Ile Gly Thr Val Asn Ala Thr Asn Gly Ser Asp Gly Asn Tyr Ser Val

20 25 30

Ser Trp Ser Asn Cys Gly Asn Phe Val Val Gly Lys Gly Trp Thr Thr

35 40 45

Gly Ser Ala Thr Arg Val Ile Asn Tyr Asn Ala His Ala Phe Ser Val

50 55 60

Val Gly Asn Ala Tyr Leu Ala Leu Tyr Gly Trp Thr Arg Asn Ser Leu
65 70 75 80

Ile Glu Tyr Tyr Val Val Asp Ser Trp Gly Thr Tyr Arg Pro Thr Gly

85 90 95

Thr Tyr Lys Gly Thr Val Thr Ser Asp Gly Gly Thr Tyr Asp Ile Tyr

100 105 110

Thr Thr Thr Arg Thr Asn Ala Pro Ser Ile Asp Gly Asn Asn Thr Thr

115 120 125

Phe Thr Gln Phe Trp Ser Val Arg Gln Ser Lys Arg Pro Ile Gly Thr

130 135 140

Asn Asn Thr Ile Thr Phe Ser Asn His Val Asn Ala Trp Lys Ser Lys
145 150 155 160
Gly Met Asn Leu Gly Ser Ser Trp Ser Tyr Gln Val Leu Ala Thr Glu

165 170 175

Gly Tyr Gln Ser Ser Gly Tyr Ser Asn Val Thr Val Trp
180 185 

<210> 377 
<211> 570 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Synthetically generated polynucleotide 
<400> 377 

atggccctta tggcttcgac attctactgg cacttgtgga ctgatggtat agggacagta 60 
aatgctacca atggatctga tggcaattac agcgtttcat ggtcaaattg cgggaatttt 120 
gttgttggta aaggctggac taccggatca gcaactaggg taataaacta taatgcccac 180 
gccttttcgg tagtgggtaa tgcttatttg gctctttatg ggtggacgag aaatccactc 240 
atagaatatt acgtcgttga tagctggggg acttatagac ctactggaac ttataaaggc 300 
actgtgacta gtgatggagg gacttatgac atatacacga ctacacgaac caacgcacct 360 
tccattgacg gcaataatac aactttcacc cagttctgga gtgttaggca gtcgaagaga 420 
ccgattggta ccaacaatac catcaccttt agcaaccatg ttaacgcctg gaagagtaaa 480 
ggaatgaatt tggggagtag ttggtcttat caggtattag caacagaggg ctatcaaagt 540 
agtgggtact ctaacgtaac ggtctggtaa 570

<210> 378 
<211> 189 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Synthetically generated polypeptide 
<400> 378 

Met Ala Leu Met Ala Ser Thr Phe Tyr Trp His Leu Trp Thr Asp Gly 

1 5 10 15 

Ile Gly Thr Val Asn Ala Thr Asn Gly Ser Asp Gly Asn Tyr Ser Val

20 25 30

Ser Trp Ser Asn Cys Gly Asn Phe Val Val Gly Lys Gly Trp Thr Thr

35 40 45

Gly Ser Ala Thr Arg Val Ile Asn Tyr Asn Ala His Ala Phe Ser Val

50 55 60

Val Gly Asn Ala Tyr Leu Ala Leu Tyr Gly Trp Thr Arg Asn Pro Leu
65 70 75 80

Ile Glu Tyr Tyr Val Val Asp Ser Trp Gly Thr Tyr Arg Pro Thr Gly

85 90 95

Thr Tyr Lys Gly Thr Val Thr Ser Asp Gly Gly Thr Tyr Asp Ile Tyr

100 105 110

Thr Thr Thr Arg Thr Asn Ala Pro Ser Ile Asp Gly Asn Asn Thr Thr

115 120 125

Phe Thr Gln Phe Trp Ser Val Arg Gln Ser Lys Arg Pro Ile Gly Thr

130 135 140

Asn Asn Thr Ile Thr Phe Ser Asn His Val Asn Ala Trp Lys Ser Lys
145 150 155 160

Gly Met Asn Leu Gly Ser Ser Trp Ser Tyr Gln Val Leu Ala Thr Glu

165 170 175

Gly Tyr Gln Ser Ser Gly Tyr Ser Asn Val Thr Val Trp
180 185 

<210> 379 
<211> 570 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> synthetically generated polynucleotide. 
<400> 379 

atggccctta tggcttcgac attctactgg cacaattgga ctgatggtat agggacagta 60 
aatgctacca atggatctga tggcaattac agcgtttcat ggtcaaattg cgggaatttt 120

gttgttggta aaggctggac taccggatca gcaactaggg taataaacta taatgcccac 180 

gccttttcgc cggtgggtaa tgcttatttg gctctttatg ggtggacgag aaattcactc 240 

atagaatatt acgtcgttga tagctggggg acttatagac ctactggaac ttataaaggc 300 

actgtgacta gtgatggagg gacttatgac atatacacga ctacacgaac caacgcacct 360 

tccattgacg gcaataatac aactttcacc cagttctgga gtgttaggca gtcgaagaga 420 

ccgattggta ccaacaatac catcaccttt agcaaccatg ttaacgcctg gaagagtaaa 480 

ggaatgaatt tggggagtag ttggtcttat caggtattag caacagaggg ctatcaaagt 540 

agtgggtact ctaacgtaac ggtctggtaa 570 

<210> 380 
<211> 189 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> synthetically generated polypeptide. 
<400> 380

Met Ala Leu Met Ala Ser Thr Phe Tyr Trp His Asn Trp Thr Asp Gly

1 5 10 15

Ile Gly Thr Val Asn Ala Thr Asn Gly Ser Asp Gly Asn Tyr Ser Val

20 25 30

Ser Trp Ser Asn Cys Gly Asn Phe Val Val Gly Lys Gly Trp Thr Thr

35 40 45

Gly Ser Ala Thr Arg Val Ile Asn Tyr Asn Ala His Ala Phe Ser Pro

50 55 60

Val Gly Asn Ala Tyr Leu Ala Leu Tyr Gly Trp Thr Arg Asn Ser Leu
65 70 75 80

Ile Glu Tyr Tyr Val Val Asp Ser Trp Gly Thr Tyr Arg Pro Thr Gly

85 90 95

Thr Tyr Lys Gly Thr Val Thr Ser Asp Gly Gly Thr Tyr Asp Ile Tyr

100 105 110

Thr Thr Thr Arg Thr Asn Ala Pro Ser Ile Asp Gly Asn Asn Thr Thr

115 120 125

Phe Thr Gln Phe Trp Ser Val Arg Gln Ser Lys Arg Pro Ile Gly Thr

130 135 140

Asn Asn Thr Ile Thr Phe Ser Asn His Val Asn Ala Trp Lys Ser Lys
145 150 155 160

Gly Met Asn Leu Gly Ser Ser Trp Ser Tyr Gln Val Leu Ala Thr Glu

165 170 175

Gly Tyr Gln Ser Ser Gly Tyr Ser Asn Val Thr Val Trp
180 185 